We have a website that was maintained by a former employee, and it appears to be encoded with Zend Guard, including all backups.
I know a little about Zend Optimizer, but never considered it for source protection as I know in the end the bytecode will need to be decoded for the interpreter, and was sure people easily decode optimized files using some software.
Now I need to decode some files and I can't find anything but some 'paid services'. We have the ownership of the code and are locked out now for any changes and debugging. How can I decode our files back?
2 Answers
The entire point of the very expensive software tool Zend Guard is to encrypt code so that it can not be decoded. That is the point.
If obfuscation is not on, then there is a possibility that you may be able to get a professional to get the code back, less comments and formatting by means of hacking the code engine. If obfuscation is on, then it's easier to rewrite it to be honest.
Have a read of this article from the Zend site, I know it is a biased source but they are right: http://forums.zend.com/viewtopic.php?f=57&t=2242
Free tools all over the place can do this now:
Is there a way to locate PHP files within a source tree that have been encoded using Zend Guard?
There are existing questions about attempting to decode Zend-encoded files, but I've inherited a large PHP app that must be using a Zend-encoded file somewhere in some remote library because I keep getting the following output in my application's error log:
I have no idea where this file is in the app! I've been unable to locate any information on characteristics of files that have been encoded by Zend Guard, so I don't know what to search the filesystem for. Google has been unhelpful. Simple greps for 'userscape', 'helpspot' (apparently a product of UserScape) and even 'zend' come up blank.
EDIT: However, according to the FAQ Zend Guard uses public key crypto, so I'm fairly sure the files won't have any recognizable PHP code in them anyway.
Is there a generic way to locate Zend Guard-encoded PHP files in a filesystem? Are there common properties of the files that are searchable?
1 Answer
I believe most Zend Optimizer encoded files will begin with a header resembling:
They will also contain all of the text that you're seeing in the error message, including the words 'Zend Optimizer'. So you can just search for that. :)
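The search suggested above can be scripted. This is only a rough sketch: the function name `findEncodedFiles` is made up for illustration, and the default marker string 'Zend Optimizer' is an assumption taken from the answer, not a documented file signature.

```php
<?php
// Hypothetical helper: list .php files under $root that contain $marker.
// The marker is an assumption based on the answer above; adjust it to the
// text that shows up in your own error log.
function findEncodedFiles(string $root, string $marker = 'Zend Optimizer'): array
{
    $matches = [];
    $iterator = new RecursiveIteratorIterator(
        new RecursiveDirectoryIterator($root, FilesystemIterator::SKIP_DOTS)
    );
    foreach ($iterator as $file) {
        if (!$file->isFile() || strtolower($file->getExtension()) !== 'php') {
            continue;
        }
        // Encoded files are mostly binary, so do a plain byte search
        // (case-insensitive) instead of trying to parse anything.
        $contents = file_get_contents($file->getPathname());
        if ($contents !== false && stripos($contents, $marker) !== false) {
            $matches[] = $file->getPathname();
        }
    }
    sort($matches);
    return $matches;
}
```

Calling `findEncodedFiles('/path/to/app')` returns the matching paths. Ordinary source files that merely mention the marker will match too, so treat the result as a list of candidates to inspect.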
If I want to create a URL using a variable I have two choices to encode the string: urlencode() and rawurlencode().
What exactly are the differences and which is preferred?
11 Answers
It will depend on your purpose. If interoperability with other systems is important then it seems rawurlencode is the way to go. The one exception is legacy systems which expect the query string to follow form-encoding style of spaces encoded as + instead of %20 (in which case you need urlencode).
rawurlencode follows RFC 1738 prior to PHP 5.3.0 and RFC 3986 afterwards (see http://us2.php.net/manual/en/function.rawurlencode.php)
Returns a string in which all non-alphanumeric characters except -_.~ have been replaced with a percent (%) sign followed by two hex digits. This is the encoding described in » RFC 3986 for protecting literal characters from being interpreted as special URL delimiters, and for protecting URLs from being mangled by transmission media with character conversions (like some email systems).
Note on RFC 3986 vs 1738. rawurlencode prior to PHP 5.3 encoded the tilde character (~) according to RFC 1738. As of PHP 5.3, however, rawurlencode follows RFC 3986 which does not require encoding tilde characters.
urlencode encodes spaces as plus signs (not as %20 as done in rawurlencode) (see http://us2.php.net/manual/en/function.urlencode.php)
Returns a string in which all non-alphanumeric characters except -_. have been replaced with a percent (%) sign followed by two hex digits and spaces encoded as plus (+) signs. It is encoded the same way that the posted data from a WWW form is encoded, that is the same way as in application/x-www-form-urlencoded media type. This differs from the » RFC 3986 encoding (see rawurlencode()) in that for historical reasons, spaces are encoded as plus (+) signs.
This corresponds to the definition for application/x-www-form-urlencoded in RFC 1866.
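A quick way to see the two behaviours side by side. This is just a sketch, and the input string is an arbitrary example chosen to contain a space, a tilde and a plus sign; the rawurlencode result assumes PHP 5.3 or later.

```php
<?php
$input = 'a b~c+d';

$form = urlencode($input);     // form style: space becomes '+', '~' is escaped
$raw  = rawurlencode($input);  // RFC 3986 style: space becomes '%20', '~' is kept

echo $form, "\n";  // a+b%7Ec%2Bd
echo $raw, "\n";   // a%20b~c%2Bd
```

Both functions escape the literal plus sign as %2B, so the space and the tilde are the only characters where the outputs differ.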
Additional Reading:
You may also want to see the discussion at http://bytes.com/groups/php/5624-urlencode-vs-rawurlencode.
Also, RFC 2396 is worth a look. RFC 2396 defines valid URI syntax. The main part we're interested in is from 3.4 Query Component:
Within a query component, the characters ';', '/', '?', ':', '@', '&', '=', '+', ',', and '$' are reserved.
As you can see, the + is a reserved character in the query string and thus would need to be encoded as per RFC 3986 (as in rawurlencode).
Proof is in the source code of PHP.
I'll take you through a quick process of how to find out this sort of thing on your own in the future any time you want. Bear with me, there'll be a lot of C source code you can skim over (I explain it). If you want to brush up on some C, a good place to start is our SO wiki.
Download the source (or use http://lxr.php.net/ to browse it online), grep all the files for the function name, you'll find something such as this:
PHP 5.3.6 (most recent at time of writing) describes the two functions in their native C code in the file url.c.
RawUrlEncode()
UrlEncode()
Okay, so what's different here?
They both are in essence calling two different internal functions respectively: php_raw_url_encode and php_url_encode
So go look for those functions!
Lets look at php_raw_url_encode
And of course, php_url_encode:
One quick bit of knowledge before I move forward: EBCDIC is another character set, similar to ASCII, but a total competitor. PHP attempts to deal with both. Basically, this means the EBCDIC byte 0x4c isn't the L it would be in ASCII, it's actually a <. I'm sure you see the confusion here.
Both of these functions manage EBCDIC if the web server has defined it.
Also, they both use an array of chars (think string type) called hexchars as a look-up to get some values; the array is described as such:
Beyond that, the functions are really different, and I'm going to explain them in ASCII and EBCDIC.
Differences in ASCII:
URLENCODE:
- Calculates a start/end length of the input string, allocates memory
- Walks through a while-loop, increments until we reach the end of the string
- Grabs the present character
- If the character is equal to ASCII char 0x20 (i.e., a 'space'), adds a + sign to the output string.
- If it's not a space, and it's also not alphanumeric (isalnum(c)), and also isn't a _, -, or . character, then we output a % sign to array position 0, then do a lookup in the hexchars array, indexed through the os_toascii array (an array from Apache that translates the char to its ASCII code) for the key of c (the present character): we bitwise shift that value right by 4 to get the high four bits and assign the matching hex digit to position 1, and to position 2 we assign the same lookup, except we perform a bitwise AND with 15 (0xF) to keep the low four bits. At the end, you'll end up with a %XX escape for the character.
- If it ends up it's not a space, and it's alphanumeric or one of the _-. chars, it outputs exactly what it is.
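The percent-encoding step described above can be sketched in PHP itself. This is an illustrative re-implementation, not the C source: the function name `sketchUrlEncode` and its `$plusForSpace` switch are made up, the hex table is assumed to be the usual "0123456789ABCDEF" string, and the tilde handling reflects PHP 5.3+ behaviour.

```php
<?php
// Illustrative re-implementation of the byte-by-byte encoding loop.
// $plusForSpace = true mimics urlencode(), false mimics rawurlencode().
function sketchUrlEncode(string $s, bool $plusForSpace): string
{
    $hexchars = '0123456789ABCDEF';
    // rawurlencode() (PHP >= 5.3) additionally leaves '~' untouched.
    $keep = $plusForSpace ? '/\A[A-Za-z0-9_.-]\z/' : '/\A[A-Za-z0-9_.~-]\z/';

    $out = '';
    $len = strlen($s);
    for ($i = 0; $i < $len; $i++) {
        $c = $s[$i];
        $byte = ord($c);
        if ($plusForSpace && $c === ' ') {
            $out .= '+';
        } elseif (preg_match($keep, $c) === 1) {
            $out .= $c;
        } else {
            // High nibble via shift, low nibble via mask, both mapped to hex digits.
            $out .= '%' . $hexchars[$byte >> 4] . $hexchars[$byte & 15];
        }
    }
    return $out;
}

echo sketchUrlEncode('a b', true), "\n";   // a+b
echo sketchUrlEncode('a b', false), "\n";  // a%20b
```

The shift and mask are the whole trick: a space is byte 0x20, so `0x20 >> 4` is 2 and `0x20 & 15` is 0, which gives %20.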
RAWURLENCODE:
- Allocates memory for the string
- Iterates over it based on length provided in function call (not calculated in function as with URLENCODE).
Note: Many programmers have probably never seen a for loop iterate this way, it's somewhat hackish and not the standard convention used with most for-loops, pay attention, it assigns x and y, checks for exit on len reaching 0, and increments both x and y. I know, it's not what you'd expect, but it's valid code.
- Assigns the present character to a matching character position in str.
- It checks if the present character is alphanumeric, or one of the _-. chars, and if it isn't, we do almost the same assignment as with URLENCODE where it performs lookups; however, we increment differently, using y++ rather than to[1]. This is because the strings are being built in different ways, but reach the same goal at the end anyway.
- When the loop's done and the length's gone, it actually terminates the string, assigning the \0 byte.
- It returns the encoded string.
Differences:
- UrlEncode checks for space, assigns a + sign, RawUrlEncode does not.
- UrlEncode does not assign a \0 byte to the string, RawUrlEncode does (this may be a moot point)
- They iterate differently, one may be prone to overflow with malformed strings, I'm merely suggesting this and I haven't actually investigated.
They basically iterate differently, one assigns a + sign in the event of ASCII 0x20.
Differences in EBCDIC:
URLENCODE:
- Same iteration setup as with ASCII
- Still translating the 'space' character to a + sign. Note-- I think this needs to be compiled in EBCDIC or you'll end up with a bug? Can someone edit and confirm this?
- It checks if the present char is a char before 0, with the exception of being a . or -, OR less than A but greater than char 9, OR greater than Z and less than a but not a _, OR greater than z (yeah, EBCDIC is kinda messed up to work with). If it matches any of those, do a similar lookup as found in the ASCII version (it just doesn't require a lookup in os_toascii).
RAWURLENCODE:
- Same iteration setup as with ASCII
- Same check as described in the EBCDIC version of URL Encode, with the exception that if it's greater than z, it excludes ~ from the URL encode.
- Same assignment as the ASCII RawUrlEncode
- Still appending the \0 byte to the string before return.
Grand Summary
- Both use the same hexchars lookup table
- UrlEncode doesn't terminate a string with \0, raw does.
- If you're working in EBCDIC I'd suggest using RawUrlEncode, as it manages the ~ that UrlEncode does not (this is a reported issue). It's worth noting that ASCII and EBCDIC 0x20 are both spaces.
- They iterate differently, one may be faster, one may be prone to memory or string based exploits.
- UrlEncode makes a space into +, RawUrlEncode makes a space into %20 via array lookups.
Disclaimer: I haven't touched C in years, and I haven't looked at EBCDIC in a really really long time. If I'm wrong somewhere, let me know.
Suggested implementations
Based on all of this, rawurlencode is the way to go most of the time. As you see in Jonathan Fingland's answer, stick with it in most cases. It deals with the modern scheme for URI components, where as urlencode does things the old school way, where + meant 'space.'
If you're trying to convert between the old format and new formats, be sure that your code doesn't goof up and turn something that's a decoded + sign into a space by accidentally double-decoding, or similar 'oops' scenarios around this space/%20/+ issue.
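Here's a small sketch of that 'oops' scenario. The sample string is arbitrary; the point is only that running a form-style decode twice eats literal plus signs.

```php
<?php
$original = 'C++ rocks';

$encoded = rawurlencode($original);  // C%2B%2B%20rocks
$once    = urldecode($encoded);      // C++ rocks   (correct)
$twice   = urldecode($once);         // each literal '+' is now treated as a space

echo $encoded, "\n";
echo $once, "\n";
echo $twice, "\n";
```

After the second decode the two plus signs have silently become spaces, so the value no longer matches what was sent.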
If you're working on an older system with older software that doesn't prefer the new format, stick with urlencode, however, I believe %20 will actually be backwards compatible, as under the old standard %20 worked, just wasn't preferred. Give it a shot if you're up for playing around, let us know how it worked out for you.
Basically, you should stick with raw, unless your EBCDIC system really hates you. Most programmers will never run into EBCDIC on any system made after the year 2000, maybe even 1990 (that's pushing, but still likely in my opinion).
yields
while
yields
The difference being the asd%20asd vs asd+asd
urlencode differs from RFC 1738 by encoding spaces as + instead of %20
One practical reason to choose one over the other is if you're going to use the result in another environment, for example JavaScript.
In PHP urlencode('test 1') returns 'test+1' while rawurlencode('test 1') returns 'test%201' as result.
But if you need to 'decode' this in JavaScript using decodeURI() function then decodeURI('test+1') will give you 'test+1' while decodeURI('test%201') will give you 'test 1' as result.
In other words the space (' ') encoded by urlencode to plus ('+') in PHP will not be properly decoded by decodeURI in JavaScript.
In such cases the rawurlencode PHP function should be used.
I believe spaces must be encoded as:
- %20 when used inside URL path component
- + when used inside URL query string component or form data (see 17.13.4 Form content types)
The following example shows the correct use of rawurlencode and urlencode:
Output:
What happens if you encode path and query string components the other way round? For the following example:
- The webserver will look for the directory latest+songs instead of latest songs
- The query string parameter q will contain lady gaga
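The path/query split discussed above can be sketched like this. The host example.com and the /search segment are placeholders made up for illustration; the directory name and the q value are borrowed from the list above.

```php
<?php
$dir   = 'latest songs';
$query = 'lady gaga';

// Path segment: rawurlencode(), so the space becomes %20.
// Query value:  urlencode(), so the space becomes +.
$url = 'http://example.com/' . rawurlencode($dir) . '/search?q=' . urlencode($query);
echo $url, "\n";  // http://example.com/latest%20songs/search?q=lady+gaga

// Decoding each part with the matching function gives the inputs back.
$parts = parse_url($url);
parse_str($parts['query'], $params);
echo rawurldecode($parts['path']), "\n";  // /latest songs/search
echo $params['q'], "\n";                  // lady gaga
```

Swapping the two encoders would put latest+songs in the path, where the plus sign is taken literally.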
The difference is in the return values, i.e:
urlencode():
Returns a string in which all non-alphanumeric characters except -_. have been replaced with a percent (%) sign followed by two hex digits and spaces encoded as plus (+) signs. It is encoded the same way that the posted data from a WWW form is encoded, that is the same way as in application/x-www-form-urlencoded media type. This differs from the » RFC 1738 encoding (see rawurlencode()) in that for historical reasons, spaces are encoded as plus (+) signs.
rawurlencode():
Returns a string in which all non-alphanumeric characters except -_. have been replaced with a percent (%) sign followed by two hex digits. This is the encoding described in » RFC 1738 for protecting literal characters from being interpreted as special URL delimiters, and for protecting URLs from being mangled by transmission media with character conversions (like some email systems).
The two are very similar, but the latter (rawurlencode) will replace spaces with a '%' and two hex digits, which is suitable for encoding passwords or such, where a '+' is not e.g.:
1. What exactly are the differences and
The only difference is in the way spaces are treated:
urlencode - based on legacy implementation converts spaces to +
rawurlencode - based on RFC 1738 translates spaces to %20
The reason for the difference is because + is reserved and valid (unencoded) in urls.
2. which is preferred?
I'd really like to see some reasons for choosing one over the other ... I want to be able to just pick one and use it forever with the least fuss.
Fair enough, I have a simple strategy that I follow when making these decisions which I will share with you in the hope that it may help.
I think it was the HTTP/1.1 specification RFC 2616 which called for 'Tolerant applications'
Clients SHOULD be tolerant in parsing the Status-Line and servers tolerant when parsing the Request-Line.
When faced with questions like these the best strategy is always to consume as much as possible and produce what is standards compliant.
So my advice is to use rawurlencode to produce standards compliant RFC 1738 encoded strings and use urldecode to be backward compatible and accommodate anything you may come across to consume.
Now you could just take my word for it, but let's prove it, shall we...
It would appear that PHP had exactly this in mind. Even though I've never come across anyone refusing either of the two formats, I can't think of a better strategy to adopt as your de facto strategy, can you?
nJoy!
urlencode: This differs from the » RFC 1738 encoding (see rawurlencode()) in that for historical reasons, spaces are encoded as plus (+) signs.
I believe urlencode is for query parameters, whereas the rawurlencode is for the path segments. This is mainly due to %20 for path segments vs + for query parameters. See this answer which talks about the spaces: When to encode space to plus (+) or %20?
However %20 now works in query parameters as well, which is why rawurlencode is always safer. However the plus sign tends to be used where user experience of editing and readability of query parameters matter.
Note that this means rawurldecode does not decode + into spaces (http://au2.php.net/manual/en/function.rawurldecode.php). This is why the $_GET is always automatically passed through urldecode, which means that + and %20 are both decoded into spaces.
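A small sketch of the decoding side. parse_str() is used here as a stand-in for how PHP fills $_GET, and the parameter names a and b are arbitrary.

```php
<?php
// Query-string parsing accepts both spellings of a space.
parse_str('a=lady+gaga&b=lady%20gaga', $params);
echo $params['a'], "\n";  // lady gaga
echo $params['b'], "\n";  // lady gaga

// The two decode functions differ only on the plus sign.
$plusKept = rawurldecode('lady+gaga');  // lady+gaga
$plusGone = urldecode('lady+gaga');     // lady gaga
echo $plusKept, "\n", $plusGone, "\n";
```

So whichever encoder produced the query string, the values arrive the same way; the difference only matters if you decode by hand with rawurldecode.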
If you want the encoding and decoding to be consistent between inputs and outputs and you have selected to always use + and not %20 for query parameters, then urlencode is fine for query parameters (key and value).
The conclusion is:
Path Segments - always use rawurlencode/rawurldecode
Query Parameters - for decoding always use urldecode (done automatically), for encoding, both rawurlencode or urlencode is fine, just choose one to be consistent, especially when comparing URLs.
Spaces encoded as %20 vs. +
The biggest reason I've seen to use rawurlencode() in most cases is because urlencode encodes text spaces as + (plus signs) where rawurlencode encodes them as the commonly-seen %20:
I have specifically seen certain API endpoints that accept encoded text queries expect to see %20 for a space and, as a result, fail if a plus sign is used instead. Obviously this is going to differ between API implementations and your mileage may vary.
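If you build query strings with http_build_query() rather than by hand, its fourth argument selects between the two styles (PHP 5.4 and later). A minimal sketch with made-up parameter names:

```php
<?php
$params = ['q' => 'lady gaga', 'page' => 2];

// Default (PHP_QUERY_RFC1738): spaces become '+', as urlencode() does.
$formStyle = http_build_query($params, '', '&');

// PHP_QUERY_RFC3986: spaces become '%20', as rawurlencode() does.
$rfc3986 = http_build_query($params, '', '&', PHP_QUERY_RFC3986);

echo $formStyle, "\n";  // q=lady+gaga&page=2
echo $rfc3986, "\n";    // q=lady%20gaga&page=2
```

For an endpoint that rejects plus signs, the second form is the one to try first.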
Simple:
- rawurlencode the path: the path is the part before the '?', and spaces must be encoded as %20
- urlencode the query string: the query string is the part after the '?', and spaces are better encoded as '+'
rawurlencode is more compatible generally.
Preface
This wiki page discusses a few open source solutions for php source code protection. This page is outdated, it was written as of May 2017 and for PHP 5. See many more solutions for code protection on the KB page: How do I protect PHP sources in the 'www' directory?.
bcompiler
bcompiler is a PECL extension and can compile php scripts to opcode/bytecode. Unfortunately it does not seem to be developed anymore. The last PHP version officially supported seems to be PHP 5.3. Building bcompiler extension fails on PHP 5.4, see PHP bug #60618.
Compiling php script to bytecode can be done using this code:
Links:
BLENC encoder
NOTE: BLENC encoder is not supported anymore with latest PHP 7. Supported versions are: PHP 5.3 / 5.4 / 5.5 / 5.6.
From PHP documentation:
BLENC is a PECL extension that permits to protect PHP source scripts with Blowfish Encryption. BLENC hooks into the Zend Engine, allowing for transparent execution of PHP scripts previously encoded with BLENC. It is not designed for complete security (it is still possible to disassemble the script into opcode/bytecode using a package such as XDebug), however it does keep people out of your code and make reverse engineering difficult.
To increase security it is recommended to compile BLENC extension from sources with your unique encryption key embedded in a DLL. Your source code will be more difficult to decrypt.
Documentation does not state clearly whether BLENC also does compile code to opcode/bytecode before encrypting it. Probably not. See the bcompiler extension that is also presented on this wiki page that does that.
Links:
Limitations
As of testing BLENC 1.1.4b the following restrictions were noticed:
- pecl.php.net page states that this extension is in BETA stage. PHP documentation states that this extension is experimental. The behaviour of this extension including the names of its functions and any other documentation surrounding this extension may change without notice in a future release of PHP. This extension should be used at your own risk.
- Script to be encoded can only contain PHP code. Only a single php opening tag at the beginning of file and a single php closing tag at the end of file are allowed when using 'blenc_encode.php' script. See comments in that script for more details. Reported PHP bug #68487 in regards to confusing example on blenc_encrypt() manual page.
- When you set php.ini 'blenc.key_file' to '.blenc_keys' and run 'blenc_myscript_encoded.php' then the following error will occur: 'Fatal error: blenc_compile: Validation of script ./blenc_myscript_encoded.php failed'. It seems that when you run the encoded script directly then BLENC looks for the key in phpdesktop executable directory. Setting path to './../www/.blenc_keys', as found in this tutorial, was done only for simplicity of the examples, it will work only when all scripts reside in the root directory. Reported as PHP bug #68488.
- One solution is to put blenc keys file in 'phpdesktop/.blenc_keys' after the process of encryption has been completed and app is ready to be distributed.
- Another solution is to use an absolute path, which can be set by application installer.
- BLENC has issues when multiple redistributable keys are put to '.blenc_keys' file. The solution is to use a fixed encryption key, and that will generate a single unique redistributable key for all encrypted php scripts. See BLENC_ENCRYPTION_KEY in the example down the page. Reported as PHP bug #68490.
Step by step tutorial
Download BLENC dll extension using the windows.php.net link up on the page. For example for PHP 5.4 download 'php_blenc-1.1.4b-5.4-nts-vc9-x86.zip'. For PHP 5.6 it would be '-5.6-nts-vc11-x86.zip'. There must be 'nts' (assuming you're using non-thread-safe version of php) and 'x86' (32bit) strings.
Extract zip file. Copy 'php_blenc.dll' to phpdesktop/php/ directory (or php/ext/ directory). Edit php.ini and add this line:
See the 'Limitations' section up the page on why the strange path for 'blenc.key_file'. You can also set the 'blenc.key_file' value in a php script using ini_set().
Create 'blenc_encode.php' script with contents below. Change BLENC_ENCRYPTION_KEY to some unique hard to guess string and keep it secret.
Create 'blenc_myscript.php' script:
Run 'blenc_encode.php' script.
Run 'blenc_myscript_encoded.php' script.
The encoded file's source will look like:
Other links
- PECL extensions binaries can be downloaded from here. For PHP 5.4 download file ending with '-5.4-nts-vc9-x86.zip': http://windows.php.net/downloads/pecl/releases/
- opcache - there is opcache_compile_file(), but no way to retrieve bytecode contents of a script
- WinCache - also no API to retrieve bytecode contents
- APC cache - loading extension results in error when using PHP CGI interface
I tried using mcrypt with a key and base64 to encrypt and then decode the code, but the code is in a variable, so when I output it using eval I always get errors. Could you point me in the right direction? I also looked at building my own PHP extension, but I wouldn't know how to output it into working PHP code.
UPDATE
I have got it to work, now I am going to convert it into an extension, I am just wondering can people decompile php extensions?
2 Answers
'can people decompile php extensions?'
Yes, it's certainly possible to reverse engineer and/or decompile compiled C code back to pseudo-code or source, but with your approach no one is going to need to in order to expose the code that you believe that you are protecting as in reality it is merely hidden.
The eval() function that you are calling is part of the opensource PHP core, and the source code could be trivially exposed either by modifying the eval() module function or the function referenced by the zend_compile_string function pointer (typically this is the address of the compile_string function).
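To make that concrete, here is a toy sketch of the decode-then-eval pattern. base64 stands in for whatever home-made encryption is used, and the one-line payload is invented for the example; the weakness is the same either way.

```php
<?php
// Stand-in for a home-made "encoder": base64 replaces real encryption here.
$protected = base64_encode('return 6 * 7;');

// The loader has to rebuild plain source before it can hand it to eval()...
$source = base64_decode($protected);
$result = eval($source);

// ...so anyone who can edit or hook the loader can print $source instead.
echo $source, "\n";  // return 6 * 7;
echo $result, "\n";  // 42
```

Whatever happens before the eval() call, the argument passed to it is ordinary PHP source, and that is the point where it can be captured.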
Systems such as Zend and ionCube operate on compiled code (which PHP always produces ready for execution), and it's the bytecode that is encoded. Consequently there is no source code in encoded files to be restored at runtime. Additionally, a required component on the server may also contain a closed source execution engine rather than passing restored bytecode to the default bytecode execution engine in the PHP core, keeping bytecode more hidden and giving the opportunity to execute bytecode that does not conform to the usual PHP bytecode structure (hence needing more reverse engineering effort to understand it).
Why would you want to write your own encoder? Please, don't. The problem is that, at some point, you will need to decode it into plain PHP code to feed to the PHP interpreter. And at that point someone can just come in and dump the code to a file.
Professional solutions like Zend_Guard and ionCube are the only solutions that actually work and are not hackable in 15 minutes by anyone with minimal PHP knowledge.