{"id":18,"date":"2011-08-19T16:01:30","date_gmt":"2011-08-19T19:01:30","guid":{"rendered":"http:\/\/mdqinc.com\/blog\/?p=18"},"modified":"2012-11-29T16:36:25","modified_gmt":"2012-11-29T19:36:25","slug":"statically-linking-python-with-cython-generated-modules-and-packages","status":"publish","type":"post","link":"https:\/\/mdqinc.com\/blog\/2011\/08\/statically-linking-python-with-cython-generated-modules-and-packages\/","title":{"rendered":"Statically linking Python with Cython generated modules and packages!"},"content":{"rendered":"<p>My latest adventures with <a href=\"http:\/\/mdqinc.com\/blog\/2011\/07\/automatic-sdl-bindings-for-python\/\" target=\"_blank\">SDL bindings<\/a> eventually led me to <a href=\"http:\/\/www.cython.org\/\" target=\"_blank\">Cython<\/a>, a tool I can recommend if you want to squeeze a little more performance out of your Python app, or just obscure your source a bit. It compiles almost any Python code as is, and it adds language extensions, such as statically typed variables and other niceties, that make the generated code even faster. Cython can automatically compile your .py\/.pyx modules, which you then load dynamically with the familiar import command. Using the imported module is exactly the same as using a native .py module or a compiled C module.<\/p>\n<p>At this point, it\u2019s important to mention that the generated .c files and their corresponding linked versions depend on the Python runtime, so you can\u2019t make a standalone executable out of them\u2026at least not easily. Now, let\u2019s suppose you didn\u2019t read the \u201cnot easily\u201d part I just mentioned, and that you wanted to integrate this module (or any other module you wrote in C from scratch) into a statically linked Python interpreter. How would you go about it?<\/p>\n<p>The following instructions were tested under 64-bit Ubuntu Natty. 
First, start by downloading the <a href=\"http:\/\/python.org\/download\/\" target=\"_blank\">Python source<\/a>. Extract it,<strong> copy Modules\/Setup.dist to Modules\/Setup<\/strong> and run configure with the following parameters:<\/p>\n<p><strong>.\/configure LDFLAGS=\"-Wl,--no-export-dynamic -static-libgcc -static\" CPPFLAGS=\"-static -fPIC\" LINKFORSHARED=\" \" DYNLOADFILE=\"dynload_stub.o\" --disable-shared --prefix=\"\/path\/to\/where\/you\/want\/it\/installed\"<\/strong><\/p>\n<p>&nbsp;<\/p>\n<p>Follow that with the all too familiar make &amp;&amp; make install.<\/p>\n<p>You will see A LOT of errors that you can (safely?) ignore, mostly related to the fact that the C modules that ship with Python won\u2019t compile in static mode without some help. Once this craziness stops, you\u2019ll have a static Python interpreter (you can run ldd .\/python to check that it\u2019s actually a standalone executable).<\/p>\n<p>Now, this Python interpreter is severely lacking in content, and no one wants to reinvent the wheel, especially such a fine wheel as the one Python provides\u2026Go to that Modules\/Setup file and take a look\u2026search for the #*shared* line, remove it and replace it with *static* (with no # sign)\u2026now look for some notable modules and uncomment them. Run the process again (configure and make) and this time you\u2019ll end up with some builtin modules that you can import.<\/p>\n<p>By now, you are probably catching my drift\u2026let\u2019s suppose you have a module test.py. Run \u201ccython test.py\u201d on it and you\u2019ll get a test.c file\u2026copy it to Modules under the Python source, and edit Modules\/Setup, adding a line:<\/p>\n<p>&nbsp;<\/p>\n<pre>test test.c<\/pre>\n<p>Do the configure and make dance again, and now you should be able to do \u201cimport test\u201d in the new Python interpreter, which will load the module as a builtin. 
Neat, right?<\/p>\n<p>If you go further down the rabbit hole and start depending on 3rd party libraries (or your own!), you will need to pay attention to how dependencies are specified in Modules\/Setup. In short, you put whatever compiler and linker directives you need after the source files for the module.<\/p>\n<p>This is all fine and dandy, but we haven\u2019t broken anything yet\u2026Let\u2019s try something more advanced\u2026imagine you have a full Python package already made (as in a full hierarchy of modules arranged in folders and subfolders, etc.), and you want to do the same Cython-fueled embedding with it\u2026After hitting your head on the wall for a looong while, you\u2019ll figure out that you actually can\u2019t (easily) do it, basically because the Python interpreter\u2019s builtin system is not geared towards packages, but rather towards shallow modules.<\/p>\n<p>So, there are two ways around it (that I know of). The first one is to use a series of shallow modules and string them into a package-like structure by importing the submodules from the parent modules\u2026<\/p>\n<pre class=\"brush:python\"># main.py\r\nimport submod1 as _submod1\r\nsubmod1 = _submod1<\/pre>\n<p>This is boring, error prone, requires a lot of glue code, doesn\u2019t play well with your module structure if you want it to also work in non-compiled mode, etc.<\/p>\n<p>The alternative is hacking the Python code just a little bit. Namely, in the Python\/import.c file, look for the find_module function and add:<\/p>\n<pre class=\"brush:c\">    \/* If the full dotted name is registered as a builtin, use it directly *\/\r\n    if (is_builtin(fullname)) {\r\n        strcpy(buf, fullname);\r\n        return &amp;fd_builtin;\r\n    }<\/pre>\n<p>Place this code near the top of the function; right above the \u201cif (path != NULL &amp;&amp; PyString_Check(path)) {\u201d line seems like a good place. What it does is check the full module name (package1.package2.module) to see if it is builtin. 
The official Python code doesn\u2019t do this; it only checks the bare module name, for the reasons stated above.<\/p>\n<p>Besides this little patch, you have to alter the Cython generated code just a bit\u2026look for the \u201cPy_InitModule4\u201d line, and replace the module name with the whole package name (if the module is package1.package2.module that line will only say \u201cmodule\u201d, and you need to replace it with the whole enchilada). Doing this by hand is a PITA, but a simple find+sed command takes care of it swiftly. Also, while you are unleashing your sed kung fu, take care of the init??? functions: if a module is at package1.package2.mymodule, replace initmymodule with initpackage1_package2_mymodule (the reason why you have to do this will become clear later\u2026or maybe not and I\u2019m just making this stuff up).<\/p>\n<p>Now, you have to go back to Modules\/Setup and edit the module line you added, appending all your sources (seems like a good job for a Python script, right?). If you run configure and make at this point you\u2019ll see that\u2026it doesn\u2019t quite work. Why? Because Python depends on a __path__ variable to figure out which module is a package and which one is just a module. Yes, you need to add those\u2026<\/p>\n<p>This is simple enough: in every package\u2019s __init__.py file, add a __path__ = ['package1\/package2\/&#8230;',] line with the right path for the location of the file.<\/p>\n<p>And finally, you are ready\u2026well, not yet. There are two more things you need to do\u2026first, as the Python build system is geared towards shallow modules, you\u2019ll have a problem if files in different subpackages have the same name, as they\u2019ll end up overwriting each other when they are compiled (this will certainly happen for the __init__.py files), so you have to figure out a way to flatten your structure before adding them to Modules\/Setup. 
What I do is scan the whole structure and copy the *.c files to a separate folder, replacing the \u2018\/\u2019 directory separator with a \u2018+\u2019 sign. This way package1\/package2\/module.c becomes package1+package2+module.c. Then, add all these files to the same line in Modules\/Setup, and then comes the final piece of glue:<\/p>\n<p>If your overall package is called\u2026let\u2019s say \u201ctest\u201d to be creative, create a test.c file with something like this:<\/p>\n<p>&nbsp;<\/p>\n<pre class=\"brush:c\">#include \"Python.h\"\r\n\r\nstatic PyMethodDef nomethods[] = { {NULL, NULL} };\r\n\r\n\/* Renamed init functions from the Cython generated sources *\/\r\nextern void inittest_module1(void);\r\nextern void inittest_package1(void);\r\nextern void inittest_package1_submodule(void);\r\n\r\nPyMODINIT_FUNC\r\ninittest(void)\r\n{\r\n    PyObject* package_test;\r\n    PyObject* path;\r\n\r\n    \/* Add a __path__ attribute so Python knows that this is a package *\/\r\n    package_test = PyImport_AddModule(\"test\");\r\n    Py_InitModule(\"test\", nomethods);\r\n    path = PyList_New(1);\r\n    PyList_SetItem(path, 0, PyString_FromString(\"test\"));\r\n    PyModule_AddObject(package_test, \"__path__\", path);\r\n\r\n    \/* Register every submodule under its full dotted name *\/\r\n    PyImport_AppendInittab(\"test.module1\", inittest_module1);\r\n    PyImport_AppendInittab(\"test.package1\", inittest_package1);\r\n    PyImport_AppendInittab(\"test.package1.submodule\", inittest_package1_submodule);\r\n}<\/pre>\n<p>Append this file to the Modules\/Setup line as well. What this file does is create a \u201ctest\u201d package, set up the __path__ variable accordingly, and append all of our modules to Python\u2019s internal builtin table. 
Now the reason for renaming the init functions earlier should become clear: each one needs a unique C symbol so it can be registered here under its full dotted name (just nod even if you got lost at \u201cStatically linking Python\u2026\u201d).<\/p>\n<p>Finally, run configure and make for the last time and your builtin package should be there\u2026or not. There are literally a hundred places where things could go wrong and the online documentation on the subject is quite sparse, which is why I\u2019m leaving this here for those brave souls who wish to try it. If something or everything in the process is not clear enough, let me know in the comments, and good luck! (You\u2019ll definitely need it.)<\/p>\n","protected":false},"excerpt":{"rendered":"<p>My latest adventures with SDL bindings eventually led me to Cython, a very recommendable tool if you are looking to extract a little bit more juice out of your Python app performance, or just hide your source a little bit more obscurely. It compiles almost any Python code as is, and it includes extensions to [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_monsterinsights_skip_tracking":false,"_monsterinsights_sitenote_active":false,"_monsterinsights_sitenote_note":"","_monsterinsights_sitenote_category":0,"footnotes":""},"categories":[3,4],"tags":[],"_links":{"self":[{"href":"https:\/\/mdqinc.com\/blog\/wp-json\/wp\/v2\/posts\/18"}],"collection":[{"href":"https:\/\/mdqinc.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/mdqinc.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/mdqinc.com\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/mdqinc.com\/blog\/wp-json\/wp\/v2\/comments?post=18"}],"version-history":[{"count":5,"href":"https:\/\/mdqinc.com\/blog\/wp-json\/wp\/v2\/posts\/18\/revisions"}],"predecessor-version":[{"id":50,"href":"https:\/\/mdqinc.com\/blog\/wp-json\/wp\/v2\/posts\/18\/revisions\/
50"}],"wp:attachment":[{"href":"https:\/\/mdqinc.com\/blog\/wp-json\/wp\/v2\/media?parent=18"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/mdqinc.com\/blog\/wp-json\/wp\/v2\/categories?post=18"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/mdqinc.com\/blog\/wp-json\/wp\/v2\/tags?post=18"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}