How to obfuscate Python code effectively

Question

I am looking for a way to hide my Python source code.

print "Hello World!"

How can I encode this example so that it isn't human-readable? I've been told to use base64, but I'm not sure how.

User · Answer

I would mask the code like this:

def MakeSC():
    c = raw_input(" Encode: ")
    sc = "\\x" + "\\x".join("{0:x}".format(ord(c)) for c in c)
    print "\n shellcode =('" + sc + "'); exec(shellcode)"

MakeSC();

Cleartext:

import os; os.system("whoami")

Encoded:

Payload = ('\x69\x6d\x70\x6f\x72\x74\x20\x6f\x73\x3b\x20\x6f\x73\x2e\x73\x79\x73\x74\x65\x6d\x28\x22\x77\x68\x6f\x61\x6d\x69\x22\x29'); exec(Payload)
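For readers on Python 3, the same hex-escape idea can be sketched as follows. The helper names `to_hex_escapes` and `decode_hex_escapes` and the harmless payload are assumptions made for illustration, not part of the answer above. This version zero-pads each byte to two digits so that characters such as newline survive the round trip, and it only handles ASCII source.

```python
def to_hex_escapes(source):
    """Return `source` written as a run of \\xNN escapes (ASCII input only)."""
    return "".join("\\x{:02x}".format(b) for b in source.encode("ascii"))


def decode_hex_escapes(escaped):
    """Turn the \\xNN text back into the original source string."""
    return bytes.fromhex(escaped.replace("\\x", "")).decode("ascii")


cleartext = "result = 6 * 7"
encoded = to_hex_escapes(cleartext)
print(encoded)  # the escaped text that would be pasted into a wrapper script

# The wrapper only has to decode and exec; a reader can do exactly the same.
namespace = {}
exec(decode_hex_escapes(encoded), namespace)
print(namespace["result"])
```

The decoder is two lines long, which is a fair measure of how much protection this gives.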

User · Answer

I know it is an old question. Just want to add my funny obfuscated "Hello world!" in Python 3 and some tips ;)

written in c      include <iostream.h>   define true false import os n   int input     STACK CALS         i CountCals     0x00  while os urandom 0x00 >> 0x01  or  1 & True      i CountCals     0o0 break   call shell command echo "hello world" > text txt ""  print hello    cal    getattr    builtins      c DATATYPE hFILE radnom   0x00     h  -1   getRndint  3  lower      o0wiXSysRdrct    eval      cal   0x63      cal   104     r RUN CALLER  0      i1CLS NATIVE   getattr    builtins      cal   101    cal   118     o0wiXSysRdrct   0b1100001    LINE 2  0  lower     line 2 kernel call   executeMAIN 0x07453320abef    i1CLS NATIVE    map    def  Main        raise 0x06 return 0   exit program with exit code 0 def  0o7af    i1CLS NATIVE   int  replace       programMain   2       join     executeMAIN 0x07453320abef   o0wiXSysRdrct   STACK CALS    return  Main   for  INCREAMENT in  0  1024       STACK CALS   0x000 >> 0x001  True&False&True&False   c      h    e    l    o        w    o    r    l    d        if for  INCREAMENT in  0  1024       STACK CALS   40  111  41  46  46    n       """"""  print word  while True      break   0o7af    while os urandom 0x00 >> 0xfa  or  1 & True     print "Hello  world"     i CountCals  -  0o0 break    while os urandom 0x00 >> 0x01  or  1 & True          i CountCals      0o0        break

It is possible to do manually, my tips are:

use eval and/or exec with encrypted strings
use [ord(i) for i in s] / ''.join(map(chr, [list of chars goes here])) as simple encryption/decryption
use obscure variable names
make it unreadable
Don't write just 1 or True, write 1&True&0x00000001 ;)
use different number systems
add confusing comments like "line 2" on line 10 or "it returns 0" on while loop
use __builtins__
use getattr and setattr

User · Answer

I'll write my answer in a didactic manner...

First type into your Python interpreter:

import this

Then go and take a look at the file this.py in the Lib directory of your Python distribution and try to understand what it does.

After that, take a look at the eval function in the documentation:

help(eval)

Now you should have found a funny way to protect your code. But beware, because that only works for people who are less intelligent than you! (And I'm not trying to be offensive: anyone smart enough to understand what you did could reverse it.)
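For readers who want to check their conclusion: this.py stores its text rotated by 13 letters (ROT13) and decodes it when the module is imported. A minimal sketch of applying the same trick to code, using the standard `codecs` module rather than the hand-built lookup table in this.py (the sample source string is an assumption):

```python
import codecs

source = 'greeting = "Hello World!"'

# ROT13 is the letter-rotation trick that this.py uses for its own text.
hidden = codecs.encode(source, "rot13")
print(hidden)

# Running the hidden code just means undoing the rotation first.
namespace = {}
exec(codecs.decode(hidden, "rot13"), namespace)
print(namespace["greeting"])
```

Since ROT13 is its own inverse, the "key" is public knowledge, which is the point the answer is making.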

User · Answer

There are multiple ways to obfuscate code  Here s just one example    lambda                                                         getattr            import   True   class     name             class     name                      class     eq     class     name                       iter       class     name                                      lambda                                         lambda                             chr                                 if     else                  lambda     func code co lnotab                 lt  lt                                  lt  lt              lt  lt         lt  lt         -                   lt  lt                  -     lt  lt             lt  lt           lt  lt              lt  lt                     lt  lt                  -     lt  lt          lt  lt              lt  lt             lt  lt                                 lt  lt             lt  lt       lt  lt                              lt  lt        -     lt  lt                         lt  lt                 lt  lt        -     lt  lt           lt  lt            lt  lt                  -     -           lt  lt           lt  lt      -     lt  lt                                    lt  lt          lt  lt              lt  lt        -          lt  lt              lt  lt                        lt  lt           lt  lt            lt  lt                     lt  lt      -     lt  lt                      lt  lt              lt  lt                 lt  lt             lt  lt           lt  lt                             lt  lt                lt  lt                                lambda                                      lambda                                  lambda     func code co nlocals                                lambda       func code co nlocals    if     else                       lambda      func code co argcount                        lambda                   lambda                       lambda                            lambda                                 
 lambda                                         lambda                                                 lambda                                                          lambda

User · Answer

Maybe you can try pyconcrete.

It encrypts .pyc files to .pye and decrypts them on import.

Encryption and decryption are done by the OpenAES library.

Usage

Full encrypted

convert all of your .py to *.pye

$ pyconcrete-admin.py compile --source={your py script}  --pye
$ pyconcrete-admin.py compile --source={your py module dir} --pye

Remove the *.py and *.pyc files, or copy the *.pye files to another folder.
main.py is encrypted as main.pye, which can't be executed by the normal python. You must use pyconcrete to process the main.pye script. pyconcrete (exe) will be installed in your system path (e.g. /usr/local/bin).
```
pyconcrete main.pye
src/*.pye  # your libs
```

Partial encrypted (pyconcrete as lib)

download pyconcrete source and install by setup.py

$ python setup.py install \
  --install-lib={your project path} \
  --install-scripts={where you want to execute pyconcrete-admin.py and pyconcrete(exe)}

import pyconcrete in your main script

Recommended project layout

main.py       # import pyconcrete and your lib
pyconcrete/*  # put pyconcrete lib in project root, keep it as original files
src/*.pye     # your libs

User · Answer

Opy (https://github.com/QQuick/Opy)

Opy will obfuscate your extensive, real world, multi module Python source code for free! And YOU choose per project what to obfuscate and what not, by editing the config file:

You can recursively exclude all identifiers of certain modules from obfuscation.
You can exclude human readable configuration files containing Python code.
You can use getattr, setattr, exec and eval by excluding the identifiers they use.
You can even obfuscate module file names and string literals.
You can run your obfuscated code from any platform.

Unlike some of the other options posted, this works for both Python 2 and 3! It is also free / open source, and it is not an online-only tool (unless you pay) like some of the others out there.

I am admittedly still evaluating this myself, but all of my initial tests of it worked perfectly. It appears this is exactly what I was looking for!

The official version runs as a standalone utility, with the original intended design being that you drop a script into the root of the directory you want to obfuscate, along with a config file to define the details/options you want to employ. I wasn't in love with that plan, so I added a fork from the project, allowing you to import and utilize the tool from a library instead. That way, you can roll this directly into a more encompassing packaging script. (You could of course wrap multiple py scripts in bash/batch, but I think a pure python solution is ideal.) I requested my fork be merged into the original work, but in case that never happens, here's the url to my revised version:

https://github.com/BuvinJT/Opy

User · Answer

Try pasting your hello world python code to the following site:

http://enscryption.com/encrypt-and-obfuscate-scripts.html

It will produce a complex encrypted and obfuscated, but fully functional script for you. See if you can crack the script and reveal the actual code. Or see if the level of complexity it provides satisfies your need for peace of mind.

The encrypted script that is produced for you through this site should work on any Unix system that has python installed.

If you would like to encrypt another way, I strongly suggest you write your own encryption/obfuscation algorithm (if security is that important to you). That way, no one can figure out how it works but you. But, for this to really work, you have to spend a tremendous amount of time on it to ensure there aren't any loopholes that someone who has a lot of time on their hands can exploit. And make sure you use tools that are already natural to the Unix system... i.e. openssl or base64. That way, your encrypted script is more portable.

User · Answer

You can use the base64 module to encode strings to stop shoulder surfing, but it's not going to stop someone finding your code if they have access to your files.

You can then use the compile() function and the eval() function to execute your code once you've decoded it. The session below is Python 2:

>>> import base64
>>> mycode = "print 'hello World !'"
>>> secret = base64.b64encode(mycode)
>>> secret
'cHJpbnQgJ2hlbGxvIFdvcmxkICEn'
>>> mydecode = base64.b64decode(secret)
>>> eval(compile(mydecode,'<string>','exec'))
hello World !

So if you have 30 lines of code you'll probably want to encode it by doing something like this:

>>> f = open('myscript.py')
>>> encoded = base64.b64encode(f.read())

You'd then need to write a second script that does the compile() and eval(), which would probably include the encoded script as a string literal encased in triple quotes. So it would look something like this:

import base64
myscript = """IyBUaGlzIGlzIGEgc2FtcGxlIFB5d
              GhvbiBzY3JpcHQKcHJpbnQgIkhlbG
              xvIiwKcHJpbnQgIldvcmxkISIK"""
eval(compile(base64.b64decode(myscript),'<string>','exec'))
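On Python 3 the same approach needs an explicit step between text and bytes, because b64encode works on bytes. A minimal sketch, in which the sample program and the `launcher` variable are assumptions for illustration:

```python
import base64

source = 'message = "Hello World!"\nprint(message)\n'

# b64encode works on bytes in Python 3, so encode the text first.
secret = base64.b64encode(source.encode("utf-8")).decode("ascii")

# A stand-alone launcher script that carries the encoded program inside it.
launcher = (
    "import base64\n"
    f"exec(compile(base64.b64decode({secret!r}), '<string>', 'exec'))\n"
)
print(launcher)

namespace = {}
exec(launcher, namespace)  # runs the hidden program inside `namespace`
```

Writing `launcher` to a .py file gives the "second script" described above; a single `base64.b64decode` call is all it takes to read the original.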

User · Answer

Check out these tools for obfuscation and minification of Python code:

pyarmor (https://pypi.org/project/pyarmor/) - full obfuscation with hex-encoding; apparently doesn't allow partial obfuscation of variable/function names only
python-minifier (https://pypi.org/project/python-minifier/) - minifies the code and obfuscates function/variable names (although not as intensely as pyminifier below)
pyminifier (https://pypi.org/project/pyminifier/) - does a good job of obfuscating names of functions, variables and literals; can also perform hex-encoding (compression), similar to pyarmor. Problem: after obfuscation the code may contain syntax errors and not run.

Example .py output from pyminifier when run with --obfuscate and --gzip:

$ pyminifier --obfuscate --gzip /tmp/tumult.py

#!/usr/bin/env python3
import zlib, base64
exec(zlib.decompress(base64.b64decode('eJx1kcFOwzAMhu95ClMO66apu0 KAQEbE5eJC IUpa27haVJ5Ljb vakLYJx4JAoiT 7  3c3626SKvSuBW6M4Sej96Jq9y1wRM E3kSexnIOBZObrSNKI7Sl59YsWDq1wLMiEKNrenoYCqB1woDwzXF9nn2rskZd1jDh 9mhOD8DVvAQ8WdtrZfwg74aNwp7ZpnMXHUaltk878ybR ZNKbSjP8JPWk6wdn72ntodQ8lQucIrdGlxaHgq3QgKqtjhCY zlN6jQ0oZZxhpfKItlkuNB3icrE4XYbDwEBICRP6NjG1rri3YyzK356CtsGwZuNd o0kYitvrBd18qgmj3kcwoTckYPtJPAyCVzSKPCMNErs85 rMINdp1tUSspMqVYbp1Q2DWKTJpcGURRDr9DIJs8wJFlKq qzZRaQ4lAnVRuJgjFynj36Ol7SX iQXr8ANfezCw   ')))
# Created by pyminifier.py (https://github.com/liftoff/pyminifier)

This output corresponds to a 40-line original input script as shown here.
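The wrapper shape in that output (compress, base64-encode, then exec the decoded result) is easy to reproduce and just as easy to undo. The sketch below is not pyminifier's implementation; `pack` and `unpack` are hypothetical helpers that only mimic the shape of the generated file.

```python
import base64
import zlib


def pack(source):
    """Return a small wrapper script that decompresses and runs `source`."""
    blob = base64.b64encode(zlib.compress(source.encode("utf-8"))).decode("ascii")
    return f"import zlib, base64\nexec(zlib.decompress(base64.b64decode({blob!r})))\n"


def unpack(packed):
    """Recover the original source from a script produced by pack()."""
    blob = packed.split("'")[1]  # the only quoted literal in the wrapper
    return zlib.decompress(base64.b64decode(blob)).decode("utf-8")


original = "total = sum(range(10))\n"
packed = pack(original)
print(packed)

namespace = {}
exec(packed, namespace)  # the wrapper runs like the original script
print(unpack(packed))    # and the original text comes straight back out
```

The compression step shrinks the file, but it contributes nothing to secrecy; the name mangling done by `--obfuscate` is what actually costs a reader time.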

User · Answer

Cython

It seems that the go-to answer for this is Cython. I'm really surprised no one else mentioned this yet. Here's the home page: https://cython.org

In a nutshell, this transforms your Python into C and compiles it, thus making it as well protected as any "normal" compiled, distributable C program.

There are limitations though. I haven't explored them in depth myself, because as I started to read about them, I dropped the idea for my own purposes. But it might still work for yours. Essentially, you can't use Python to the fullest, with the dynamic awesomeness it offers. One major issue that jumped out at me was that keyword parameters are not usable :( You must write function calls using positional parameters only. I didn't confirm this, but I doubt you can use conditional imports, or evals. I'm not sure how polymorphism is handled...

Anyway, if you aren't trying to obfuscate a huge code base after the fact, or ideally if you have the use of Cython in mind to begin with, this is a very notable option.

User · Answer

This is only a limited, first-level obfuscation solution, but it is built-in: Python has a compiler to byte-code:

python -OO -m py_compile <your program.py>

produces a .pyo file that contains byte-code, and where docstrings are removed, etc. You can rename the .pyo file with a .py extension, and python <your program.py> runs like your program but does not contain your source code.

PS: the "limited" level of obfuscation that you get is such that one can recover the code (with some of the variable names, but without comments and docstrings). See the first comment for how to do it. However, in some cases, this level of obfuscation might be deemed sufficient.

PPS: If your program imports modules obfuscated like this, then you need to rename them with a .pyc suffix instead (I'm not sure this won't break one day), or you can work with the .pyo files and run them with python -O ….pyo (the imports should work). This will allow Python to find your modules (otherwise, Python looks for .py modules).
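Two notes for current interpreters. Since Python 3.5 the .pyo extension no longer exists; optimized byte-code is written as .pyc files instead. And the "limited" nature of this obfuscation is easy to see by loading the compiled file back. The sketch below assumes CPython 3.7 or later, where a .pyc file starts with a 16-byte header; the file names and the sample module are made up for the illustration.

```python
import marshal
import os
import py_compile
import tempfile

with tempfile.TemporaryDirectory() as workdir:
    src_path = os.path.join(workdir, "hello.py")
    pyc_path = os.path.join(workdir, "hello.pyc")
    with open(src_path, "w") as f:
        f.write('"""Module docstring."""\nsecret_name = "Hello World!"\n')

    # optimize=2 corresponds to running with -OO: docstrings and asserts are dropped.
    py_compile.compile(src_path, cfile=pyc_path, optimize=2, doraise=True)

    with open(pyc_path, "rb") as f:
        f.seek(16)              # assumed header size (CPython 3.7+)
        code = marshal.load(f)  # the module's code object

# The identifiers and constants are still sitting in the code object.
recovered_names = code.co_names
recovered_consts = code.co_consts
print(recovered_names, recovered_consts)
```

Comments are gone, but variable names and string literals survive, which is what decompilers build on.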

User · Answer

I recently stumbled across this blogpost: Python Source Obfuscation using ASTs where the author talks about python source file obfuscation using the builtin AST module. The compiled binary was to be used for the HitB CTF and as such had strict obfuscation requirements.

Since you gain access to individual AST nodes, using this approach allows you to perform arbitrary modifications to the source file. Depending on what transformations you carry out, resulting binary might/might not behave exactly as the non-obfuscated source.
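As a small illustration of what such a transformation can look like, the sketch below replaces every string literal with an expression that rebuilds it at run time. It is not taken from the blog post: the `HideStrings` class is an assumption, it targets Python 3.8 or later (where literals are `ast.Constant` nodes), and it does not handle f-strings or preserve docstrings.

```python
import ast


class HideStrings(ast.NodeTransformer):
    """Replace each string literal with an expression that rebuilds it."""

    def visit_Constant(self, node):
        if isinstance(node.value, str):
            # e.g. "Hi" becomes bytes([72, 105]).decode('utf-8')
            expr = "bytes({!r}).decode('utf-8')".format(list(node.value.encode("utf-8")))
            return ast.parse(expr, mode="eval").body
        return node


source = 'greeting = "Hello" + " " + "World"\n'
tree = HideStrings().visit(ast.parse(source))
ast.fix_missing_locations(tree)

if hasattr(ast, "unparse"):  # ast.unparse exists from Python 3.9 on
    print(ast.unparse(tree))

namespace = {}
exec(compile(tree, "<obfuscated>", "exec"), namespace)
print(namespace["greeting"])
```

Because the tree is compiled directly, the transformed program behaves like the original as long as the rewrite itself preserves meaning, which is the caveat raised above.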

User · Answer

As other answers have stated, there really just isn't a way that's any good. Base64 can be decoded. Bytecode can be decompiled. Python was initially just interpreted, and most interpreted languages try to speed up machine interpretation more than make it difficult for human interpretation.

Python was made to be readable and shareable, not obfuscated. The language decisions about how code has to be formatted were to promote readability across different authors.

Obfuscating Python code just doesn't really mesh with the language. Re-evaluate your reasons for obfuscating the code.

User · Answer

There are 2 ways to obfuscate python scripts

Obfuscate byte code of each code object
Obfuscate whole code object of python module

Obfuscate Python Scripts

Compile python source file to code object

char * filename = "xxx.py";
char * source = read_file( filename );
PyObject *co = Py_CompileString( source, filename, Py_file_input );

Iterate code object, wrap bytecode of each code object as the following format

0   JUMP_ABSOLUTE            n = 3 + len(bytecode)    
3
...
... Here it's obfuscated bytecode
...

n   LOAD_GLOBAL              ? (__armor__)
n+3 CALL_FUNCTION            0
n+6 POP_TOP
n+7 JUMP_ABSOLUTE            0

Serialize code object and obfuscate it

char *original_code = marshal.dumps( co );
char *obfuscated_code = obfuscate_algorithm( original_code  );

Create wrapper script "xxx.py", ${obfuscated_code} stands for string constant generated in previous step.
```
__pyarmor__(__name__, b'${obfuscated_code}')
```

Run or Import Obfuscated Python Scripts

When this wrapper script is imported or run, its first statement calls a CFunction:

int __pyarmor__(char *name, unsigned char *obfuscated_code) 
{
  char *original_code = restore_obfuscated_code( obfuscated_code );
  PyObject *co = marshal.loads( original_code );
  PyObject *mod = PyImport_ExecCodeModule( name, co );
}

This function accepts 2 parameters: module name and obfuscated code, then

Restore obfuscated code
Create a code object by original code
Import original module (this will result in a duplicated frame in Traceback)

Run or Import Obfuscated Bytecode

After the module is imported, the first time any code object in it is called, the wrapped bytecode described in the section above behaves as follows:

The first op, JUMP_ABSOLUTE, jumps to offset n
At offset n, the instructions call a PyCFunction. This function restores the obfuscated bytecode between offset 3 and n, and places the original bytecode at offset 0
After the function call, the last instruction jumps back to offset 0. The real bytecode is now executed

Refer to Pyarmor
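The module-level half of this scheme (compile, serialize with marshal, obfuscate, then restore and execute on import) can be sketched in pure Python. This is not Pyarmor's algorithm: the single-byte XOR and the names `KEY`, `obfuscate` and `run_obfuscated` are stand-ins chosen for illustration, and the per-function bytecode wrapping is not reproduced here.

```python
import marshal

KEY = 0x5A  # hypothetical single-byte key, for illustration only


def obfuscate(source, filename="<module>"):
    """Compile the source, serialize the code object, then XOR every byte."""
    code = compile(source, filename, "exec")
    return bytes(b ^ KEY for b in marshal.dumps(code))


def run_obfuscated(blob, namespace):
    """Undo the XOR, rebuild the code object, and execute it."""
    code = marshal.loads(bytes(b ^ KEY for b in blob))
    exec(code, namespace)


blob = obfuscate("answer = 6 * 7\n")
namespace = {"__name__": "demo"}
run_obfuscated(blob, namespace)
print(namespace["answer"])
```

In a pure-Python version the restore step is readable by anyone, which is presumably why the real tool keeps that step in a compiled C function. Note also that marshal data is specific to the Python version that produced it.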

User · Answer

You could embed your code in C/C++ and compile it (see "Embedding Python in Another Application").

embedded.c

#include <Python.h>

int main(int argc, char *argv[])
{
    Py_SetProgramName(argv[0]);  /* optional but recommended */
    Py_Initialize();
    PyRun_SimpleString("print('Hello world !')");
    Py_Finalize();
    return 0;
}

In Ubuntu/Debian:

$ sudo apt-get install python-dev

In Centos/Redhat/Fedora:

$ sudo yum install python-devel

compile with:

$ gcc -o embedded -fPIC -I/usr/include/python2.7 -lpython2.7 embedded.c

run with:

$ chmod u+x ./embedded
$ time ./embedded
Hello world !

real  0m0.014s
user  0m0.008s
sys 0m0.004s

initial script: hello_world.py:

print('Hello World !')

run the script:

$ time python hello_world.py
Hello World !

real  0m0.014s
user  0m0.008s
sys 0m0.004s

However, some strings of the Python code may be found in the compiled file:

$ grep "Hello" ./embedded
Binary file ./embedded matches

$ grep "Hello World" ./embedded
$

In case you want an extra bit of obfuscation you could use base64:

...
PyRun_SimpleString("import base64\n"
                   "base64_code = 'your python code in base64'\n"
                   "code = base64.b64decode(base64_code)\n"
                   "exec(code)");
...

e.g. create the base64 string of your code:

$ base64 hello_world.py
cHJpbnQoJ0hlbGxvIFdvcmxkICEnKQoK

embedded_base64.c

#include <Python.h>

int main(int argc, char *argv[])
{
    Py_SetProgramName(argv[0]);  /* optional but recommended */
    Py_Initialize();
    PyRun_SimpleString("import base64\n"
                       "base64_code = 'cHJpbnQoJ0hlbGxvIFdvcmxkICEnKQoK'\n"
                       "code = base64.b64decode(base64_code)\n"
                       "exec(code)\n");
    Py_Finalize();
    return 0;
}

all commands:

$ gcc -o embedded_base64 -fPIC -I/usr/include/python2.7 -lpython2.7 ./embedded_base64.c
$ chmod u+x ./embedded_base64

$ time ./embedded_base64
Hello World !

real  0m0.014s
user  0m0.008s
sys 0m0.004s

$ grep "Hello" ./embedded_base64
$

update: this project (pyarmor) might also help: https://pypi.org/project/pyarmor/

User · Answer

The best way to do this is to first generate a .c file, and then compile it with tcc to a .pyd file
Note: Windows-only

Requirements

tcc

pyobfuscate

Cython

Install:

sudo pip install -U cython

To obfuscate your .py file:

pyobfuscate.py myfile.py >obfuscated.py

To generate a .c file,

~~Add an init<filename>() function to your .py file~~ Optional
cython --embed file.py
cp Python.h tcc\include
tcc file.c -o file.pyd -shared -I\path\to\Python\include -L\path\to\Python\lib
import .pyd file into app.exe

User · Answer

"so that it isn't human-readable? i mean all the file is encoded !! when you open it you can't understand anything .. ! that what i want"

At most, you can compile your sources into bytecode and then distribute only the bytecode. But even this is reversible. Bytecode can be decompiled into semi-readable sources.

Base64 is trivial to decode for anyone, so it cannot serve as actual protection and will "hide" sources only from complete PC illiterates. Moreover, if you plan to actually run that code by any means, you would have to include the decoder right in the script (or in another script in your distribution, which would need to be run by a legitimate user), and that would immediately give away your encoding/encryption.

Obfuscation techniques usually involve comment/docstring stripping, name mangling, trash code insertion, and so on, so even if you decompile the bytecode, you get not very readable sources. But they will be Python sources nevertheless, and Python is not good at becoming an unreadable mess.

If you absolutely need to protect some functionality, I'd suggest going with compiled languages, like C or C++, compiling and distributing a .so/.dll, and then using Python bindings to the protected code.

User · Answer

Try this python obfuscator: pyob.oxyry.com

__all__ = ['foo']

a = 'a'
_b = 'b'

def foo():
    print(a)

def bar():
    print(_b)

def _baz():
    print(a + _b)

foo()
bar()
_baz()

will be translated to

__all__ =['foo']#line:1
OO00OO0OO0O00O0OO ='a'#line:3
_O00OO0000OO0O0O0O ='b'#line:4
def foo ():#line:6
    print (OO00OO0OO0O00O0OO )#line:7
def O0000000OOOO00OO0 ():#line:9
    print (_O00OO0000OO0O0O0O )#line:10
def _OOO00000O000O0OOO ():#line:12
    print (OO00OO0OO0O00O0OO +_O00OO0000OO0O0O0O )#line:13
foo ()#line:15
O0000000OOOO00OO0 ()#line:16
_OOO00000O000O0OOO ()#line:17
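The renaming shown above can be approximated with the standard `ast` module. This is only a sketch of the idea and not how that site works: `obfuscate_names` is a hypothetical helper, the O/0 naming scheme is made up, and only top-level functions and simple assignments are renamed (local variables, attributes, imports and name collisions are ignored).

```python
import ast


def obfuscate_names(source):
    """Rename top-level names that are not listed in __all__."""
    tree = ast.parse(source)

    # Collect names the module defines at top level, and what it exports.
    defined, exported = set(), set()
    for node in tree.body:
        if isinstance(node, ast.FunctionDef):
            defined.add(node.name)
        elif isinstance(node, ast.Assign):
            for target in node.targets:
                if isinstance(target, ast.Name):
                    defined.add(target.id)
                    if target.id == "__all__":
                        exported.update(ast.literal_eval(node.value))

    # Exported and dunder names keep their spelling; the rest get O/0 noise.
    mapping = {}
    for name in sorted(defined - exported):
        if not name.startswith("__"):
            mapping[name] = "O" + format(len(mapping) + 2, "b").replace("1", "O")

    for node in ast.walk(tree):
        if isinstance(node, ast.Name) and node.id in mapping:
            node.id = mapping[node.id]
        elif isinstance(node, ast.FunctionDef) and node.name in mapping:
            node.name = mapping[node.name]
    return tree, mapping


sample = '''
__all__ = ["foo"]
a = "a"
_b = "b"

def foo():
    return a

def bar():
    return a + _b

result = foo() + bar()
'''

tree, mapping = obfuscate_names(sample)
namespace = {}
exec(compile(tree, "<obfuscated>", "exec"), namespace)
print(mapping)
```

Keeping the names in `__all__` intact is what lets other modules keep importing the public API after obfuscation.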

User · Answer

Well if you want to make a semi-obfuscated code you make code like this:

import base64
import zlib
def run(code): exec(zlib.decompress(base64.b16decode(code)))
def enc(code): return base64.b16encode(zlib.compress(code))

and make a file like this (using the above code):

f = open('something.py','w')
# repr() adds the quotes, so something.py contains a valid string assignment
f.write("code = " + repr(enc("""
print("test program")
print(raw_input("> "))""")))
f.close()

file "something.py":

code = '789CE352008282A2CCBC120DA592D4E212203B3FBD28315749930B215394581E9F9957500A5463A7A0A4A90900ADFB0FF9'

just import "something.py" and run run(something.code) to run the code in the file.
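The helpers above are written for Python 2. A minimal Python 3 sketch of the same pair follows, with explicit conversion between text and bytes; the optional `namespace` parameter and the sample program are additions made for this illustration.

```python
import base64
import zlib


def enc(code):
    """Compress the source and return it as an uppercase hex string."""
    return base64.b16encode(zlib.compress(code.encode("utf-8"))).decode("ascii")


def run(code, namespace=None):
    """Decode, decompress and execute a string produced by enc()."""
    source = zlib.decompress(base64.b16decode(code)).decode("utf-8")
    exec(source, namespace if namespace is not None else {})


hidden = enc('print("test program")\nvalue = 1 + 1\n')
print(hidden)

namespace = {}
run(hidden, namespace)
```

As with base64, the hex string hides the text only from someone who does not try `zlib.decompress` on it.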

One trick is to make the code hard to read by design: never document anything; if you must, just give the output of a function, not how it works. Make variable names very broad, movie references, or opposites. Example: btmnsfavclr = 16777215, where "btmnsfavclr" means "Batman's Favorite Color" and the value 16777215 is the decimal form of "ffffff", or white. Remember to mix different styles of naming to keep those pesky people out of your code. Also, use tips on this site: Top 11 Tips to Develop Unmaintainable Code.

User · Answer

Maybe you should look into using something simple like a truecrypt volume for source code storage as that seems to be a concern of yours. You can create an encrypted file on a usb key or just encrypt the whole volume (provided the code will fit) so you can simply take the key with you at the end of the day.

To compile, you could then use something like PyInstaller or py2exe in order to create a stand-alone executable. If you really wanted to go the extra mile, look into a packer or compression utility in order to add more obfuscation. If none of these are an option, you could at least compile the script into bytecode so it isn't immediately readable. Keep in mind that these methods will merely slow someone trying to debug or decompile your program.
