Securing Python Code with Cython
Because Python is an interpreted language, protecting its source code is a challenging task: to run the code, the interpreter needs it to be available in some form.
In this article, I’ll detail one approach to protecting a Python-based codebase: compiling modules with Cython.
Cython is a static compiler for the Python and Cython programming languages that simplifies the job of writing Python C extensions. Cython allows us to compile Python code; the result is a set of dynamic libraries that can also be used as Python modules.
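When a module is imported, Python considers the following file types, and a compiled shared library takes precedence over the others:
- shared library (.so, .pyd)
- python bytecode (.pyc; also .pyo before Python 3.5)
- python file (.py)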
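To see which file suffixes a given interpreter recognizes for each of these categories, the standard library's importlib.machinery module can be queried. The following is a minimal sketch; the module_kind helper is a hypothetical name introduced here for illustration and is not part of the original article.

```python
import importlib.machinery as machinery


def module_kind(filename):
    """Classify a file name as an extension module, source file, or bytecode file.

    Hypothetical helper for illustration; it only inspects the file suffix.
    """
    # Extension suffixes are platform specific (for example .so or .pyd).
    if any(filename.endswith(s) for s in machinery.EXTENSION_SUFFIXES):
        return "extension"
    if any(filename.endswith(s) for s in machinery.SOURCE_SUFFIXES):
        return "source"
    if any(filename.endswith(s) for s in machinery.BYTECODE_SUFFIXES):
        return "bytecode"
    return "unknown"


if __name__ == "__main__":
    print("extension suffixes:", machinery.EXTENSION_SUFFIXES)
    print("source suffixes:   ", machinery.SOURCE_SUFFIXES)
    print("bytecode suffixes: ", machinery.BYTECODE_SUFFIXES)
```

The exact extension suffixes depend on the platform and Python version, so the printed lists will differ between machines.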
So… what are the benefits of using Cython compiled modules?
- Binary modules make it much harder to recover the original Python code; reverse engineering techniques must be used to do so.
- The C code generated by Cython can be modified to introduce changes, improve protection, etc.
- GCC optimization flags can be used while compiling the library.
- Tracebacks won’t reveal code, just line numbers (unless disabled).
- Cython translates Python code to C, which is then compiled by GCC (or a similar compiler); the compiled code typically runs faster than the pure Python version.
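As an illustration of the optimization-flag point, the sketch below assembles a GCC command line that would build a Cython-generated C file into a shared library module. It assumes a Linux-like system with gcc available and a C file generated without the --embed flag; the shared_library_command helper and its parameters are hypothetical names introduced here, and the flags shown are one common combination rather than the only valid one.

```python
import sysconfig


def shared_library_command(c_file, module_name, opt_level="-O2"):
    """Build (but do not run) a gcc command for compiling a Cython-generated
    C file into an importable shared library.

    Hypothetical helper: the flag set is an assumption for a Linux-like
    platform, not a definitive build recipe.
    """
    # Directory containing Python.h for the running interpreter.
    include_dir = sysconfig.get_paths()["include"]
    # Platform-specific suffix for extension modules; fall back to .so.
    ext_suffix = sysconfig.get_config_var("EXT_SUFFIX") or ".so"
    return [
        "gcc", "-shared", "-fPIC", opt_level,
        "-I" + include_dir,
        c_file,
        "-o", module_name + ext_suffix,
    ]


if __name__ == "__main__":
    print(" ".join(shared_library_command("hello.c", "hello", "-O3")))
```

The returned list can be passed to subprocess.run once the C file exists; changing opt_level (for example to -O3) is where the optimization choice is made.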
Let’s review the basic functionality of Cython
Remember the hello.py script from the HashiCorp Vault Secret Manager article? Pulling secrets from HashiCorp Vault is great, but if you think about it… if users can access and modify the code, they can add a simple print statement to reveal the secrets (see the three print statements at the end of the script).
    import getpass
    import hvac

    VAULT_ADDR = 'http://127.0.0.1:8200'
    # Prompt for the token so it is never stored in the script
    VAULT_TOKEN = getpass.getpass('Hashicorp Vault Token ID: ')

    # Create a client authenticated against the Vault server
    client = hvac.Client(
        url=VAULT_ADDR,
        token=VAULT_TOKEN
    )

    # Read the secrets stored under the 'ap' path
    response = client.secrets.kv.read_secret_version(path='ap')
    client_id = response['data']['data']['client_id']
    client_secret = response['data']['data']['client_secret']
    repo_token = response['data']['data']['repo_token']

    # These print statements reveal the secrets (kept only for this POC)
    print("Client ID: " + client_id)
    print("Client Secret: " + client_secret)
    print("Repo Token: " + repo_token)
Hmm… We need to prevent others from modifying the file. Let’s see how Cython can help with that.
1. For the sake of this POC, let’s leave the three print statements at the end of the script in place. Preferably, these lines should be removed 😉
2. Make sure the “python3-devel” package is installed (e.g., sudo yum install python3-devel)
3. Install Cython – sudo pip3 install Cython
    $ sudo pip3 install Cython
    Collecting Cython
      Downloading https://files.pythonhosted.org/packages/40/67/36322cf0387cf65e6be80ba2d9a33db227ecbc624902f0cb2e4bf456261f/Cython-0.29.23-cp38-cp38-manylinux1_x86_64.whl (1.9MB)
         |████████████████████████████████| 1.9MB 23.3MB/s
    Installing collected packages: Cython
    Successfully installed Cython-0.29.23
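Before moving on, it can help to confirm that Cython is importable by the same interpreter that will be used for the build, since pip3 and python3 may point to different installations. A minimal sketch, where cython_installed is a hypothetical helper name:

```python
import importlib.util


def cython_installed():
    """Return True if the Cython package can be found by this interpreter."""
    return importlib.util.find_spec("Cython") is not None


if __name__ == "__main__":
    if cython_installed():
        import Cython
        print("Cython", Cython.__version__)
    else:
        print("Cython is not installed for this interpreter")
```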
4. Convert the Python code into C code – cython hello.py --embed (note: the --embed flag generates a main() function so that the C code can be built as a standalone program. Without --embed, the C code has no main(), as it is meant to be built as a shared object rather than a standalone executable. After the following command runs, a C source file, hello.c, should be created in the same directory)
    $ cython hello.py --embed
    /usr/local/lib64/python3.8/site-packages/Cython/Compiler/Main.py:369: FutureWarning: Cython directive 'language_level' not set, using 2 for now (Py2). This will change in a later release! File: /home/ec2-user/hello.py
      tree = Parsing.p_module(s, pxd, full_module_name)
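The FutureWarning in the output above appears because no language level was set, so this Cython version falls back to Python 2 semantics. One way to address it is a directive comment at the top of the source file (the cython command also accepts a -3 option for the same purpose). The sketch below uses a hypothetical greeting function as a stand-in for the real script body:

```python
# cython: language_level=3
# The directive comment above must appear before any code in the file.
# Plain Python ignores it; Cython reads it and compiles the module
# with Python 3 syntax and semantics.


def greeting(name):
    """Hypothetical stand-in for the real script body."""
    return "Hello, " + name


if __name__ == "__main__":
    print(greeting("Cython"))
```

Because the directive is an ordinary comment to the interpreter, the same file keeps working when run directly with python3.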
5. Compile the C code into an executable – gcc `python3-config --cflags --ldflags` hello.c -o hello (note: the Python include and library paths must be specified, which is what python3-config provides. Running the following command should create an executable file named hello; this is the distributable binary)
    $ gcc `python3-config --cflags --ldflags` hello.c -o hello
    $ [NO OUTPUT]
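One detail to watch when reproducing this step: starting with Python 3.8, python3-config only adds libpython to the linker flags when the --embed option is passed, so the command above may fail at link time on newer interpreters. The sketch below picks the options based on the interpreter version; python_config_args is a hypothetical helper name introduced for illustration.

```python
import sys


def python_config_args(version_info=sys.version_info):
    """Return the python3-config options for linking an embedded interpreter.

    Hypothetical helper for illustration; it only builds the option list.
    """
    args = ["--cflags", "--ldflags"]
    # Since Python 3.8, libpython is only included in the link flags
    # when --embed is passed to python3-config.
    if tuple(version_info[:2]) >= (3, 8):
        args.append("--embed")
    return args


if __name__ == "__main__":
    print("python3-config " + " ".join(python_config_args()))
```

On Python 3.8 or later the resulting build command would then read gcc `python3-config --cflags --ldflags --embed` hello.c -o hello.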
6. Check the folder content – ls -rtl
    $ ls -rtl
    total 276
    -rw-rw-r--. 1 ec2-user ec2-user    545 Jul 11 16:06 hello.py
    -rw-rw-r--. 1 ec2-user ec2-user 139572 Jul 11 17:27 hello.c
    -rwxrwxr-x. 1 ec2-user ec2-user 132312 Jul 11 17:29 hello
7. Run the hello script – ./hello (when asked, enter the “Root Token” from HashiCorp Vault Secret Manager article, step #4)
    $ ./hello
    Hashicorp Vault Token ID: [ --> Root Token: s.4Gl4TLJb1D82OWxxxxxxxxxx]
    Client ID: 123456789
    Client Secret: 987654321
    Repo Token: a1b2c3d4e5
8. View the hello file content – cat hello (note: file output was truncated)
    $ cat hello
    ELF>?J@@?@8 @'&@@@@@h??@?@@@HUHU 0]0]`0]`? ?HX[cBE??j??@?@ Cֻ?|??V?T?????@?@ P?td`P`P@`P@??Q?tdR?td0]0]`0]`??/lib64/ld-linux-x86-64.so.2GNU?GNUGNU?M?;>P??¸ܿ???ȡX?d! :? @h`(?E@F @b`5`L@??F@<J?J@/?h`Q?K@ea@L@libpython3.6m.so.1.0_ITM_deregisterTMCloneTable__gmon_start___ITM_registerTMCloneTablelibpthread.so.0libdl.so.2libutil.so.1libm.so.6_PyThreadState_UncheckedGetPyFrame_NewPyEval_EvalFrameExPyObject_GetAttrPyObject_CallPyThreadState_Get
    [output truncated]
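Even though the binary is not human-readable, the dump shows that runs of plain text, such as library and symbol names, are still present. The sketch below is a rough Python equivalent of the Unix strings utility that can be used to check what readable text remains in a compiled file. The printable_strings helper and the sample bytes are hypothetical, and whether a given literal from the source survives in the binary depends on the Cython version and compiler settings.

```python
import re


def printable_strings(data, min_len=6):
    """Return runs of printable ASCII characters found in a bytes object.

    Hypothetical helper that mimics the basic behavior of `strings`.
    """
    # Match min_len or more consecutive bytes in the printable ASCII range.
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [match.decode("ascii") for match in pattern.findall(data)]


if __name__ == "__main__":
    # Made-up sample bytes standing in for the contents of a binary file.
    sample = b"\x00\x01PyInit_hello\x00\x7fab\x00libpython3.6m.so.1.0\xff"
    for text in printable_strings(sample):
        print(text)
```

To inspect the real executable, the bytes could be read with open("hello", "rb").read() and passed to the same function. Any secrets or sensitive literals that show up in the result are a sign that compilation alone does not hide them.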
Final thoughts
This article explored one solution to the problem of protecting Python source code. Cython seems like a promising option to consider. It is true that users will still have access to binaries that can be reverse engineered, but doing so takes a good amount of time and work.
Disclaimers
- This article aims to cover the basic functionality of Cython.
- It’s also possible to combine the different approaches to provide an even more secure environment.
- Want to learn more about Cython? Please contact the Cross-Domain TAB team (mailto: xdc-amer-tab).