Interesting results running Tactic with pypy

Diego

Hi all,

I did some quick tests with tactic + pypy and got some interesting results I'd like to share and discuss.

I made a couple of trivial changes to make tactic work on pypy (see below). Remko, would you accept a PR for these?

 

I have a tactic SType with 84773 SObjects stored (not recommended :-)). Here are some query results with tactic running on a test KVM virtual machine with CentOS 7.3.1611, postgresql 9.2.18 and Python 2.7.13:

_ = server.eval("@SOBJECT(paf/sp_shader)")

WARNING query: (84773) sobjects found: SELECT "paf"."public"."sp_shader".* FROM "paf"."public"."sp_shader" WHERE ("sp_shader"."s_status" != 'retired' or "sp_shader"."s_status" is NULL) ORDER BY "sp_shader"."code"
Duration: 18.255 seconds (request_id: 139676295886592 - #0000001)
Duration: 19.154 seconds (request_id: 139676287493888 - #0000002)
Duration: 19.289 seconds (request_id: 139676279101184 - #0000003)
Duration: 19.325 seconds (request_id: 139676270708480 - #0000004)

 

The query usually takes about 19 seconds to finish.

 

Now, the same machine running tactic with pypy 5.8:

Duration: 7.792 seconds (request_id: 140002897942272 - #0000001)
Duration: 5.623 seconds (request_id: 140002889549568 - #0000002)
Duration: 6.494 seconds (request_id: 140002881156864 - #0000003)
Duration: 4.902 seconds (request_id: 140002872764160 - #0000004)
Duration: 4.733 seconds (request_id: 140002864371456 - #0000005)
Duration: 6.765 seconds (request_id: 140002377856768 - #0000006)
 
Not bad! And the memory footprint is smaller too! These benchmarks are obviously very limited and real-world speedups may vary, but it's really promising, and it's easy to run tactic with pypy.
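In case anyone wants to reproduce the timing from the client side, here is a minimal sketch. The TacticServerStub connection details (host and project names) are assumptions; only the @SOBJECT expression comes from the log above.

```python
import time

def timed(fn, *args):
    """Call fn(*args) and return (result, elapsed seconds)."""
    start = time.time()
    result = fn(*args)
    return result, time.time() - start

# Against a live tactic server (hypothetical connection details):
# from tactic_client_lib import TacticServerStub
# server = TacticServerStub(server='localhost', project='paf')
# shaders, elapsed = timed(server.eval, "@SOBJECT(paf/sp_shader)")
# print("%d sobjects in %.3f seconds" % (len(shaders), elapsed))
```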
 
In case anyone wants to do some more experiments, here is a small howto:
 
- I assume you already have a tactic TEST server set up (I use tactic 4.6 synced with github, but 4.5 should work too). DON'T TRY THIS ON YOUR PRODUCTION SERVER!
- Download a pypy 5.8 binary release and extract it to /opt/pypy-5.8
- As the tactic user, create a pypy virtualenv: "/opt/pypy-5.8/bin/virtualenv-pypy /home/apache/pypy-venv"
- As the tactic user, activate the virtualenv: ". /home/apache/pypy-venv/bin/activate"
- Now install the needed modules: "pip install lxml PyCrypto psycopg2cffi pillow"
- Make a couple of small changes so tactic is compatible with pypy:
**************************************************
--- a/src/pyasm/search/database_impl.py
+++ b/src/pyasm/search/database_impl.py
@@ -572,7 +572,8 @@ class DatabaseImpl(DatabaseImplInterface):
         columns = []
         for description in sql.description:
             # convert to unicode
-            value = unicode(description[0], 'utf-8')
+            # value = unicode(description[0], 'utf-8')
+            value = description[0].decode('utf-8')
             columns.append(value)
 
         return columns
**************************************************
and:
**************************************************
--- a/src/pyasm/search/sql.py
+++ b/src/pyasm/search/sql.py
@@ -32,7 +32,11 @@ except ImportError, e:
     pass
 
 try:
-    import psycopg2
+    try:
+        import psycopg2
+    except ImportError, e:
+        from psycopg2cffi import compat
+        compat.register()
     # set to return only unicode
     import psycopg2.extensions
     psycopg2.extensions.register_type(psycopg2.extensions.UNICODE)
**************************************************
- Now (as the tactic user, with the new virtualenv active) you can run tactic in dev mode: "pypy /home/apache/tactic/src/bin/startup_dev.py" or normal mode: "pypy /home/apache/tactic/src/bin/monitor.py"
remko

This is quite interesting.  I have spent a lot of time trying to make the conversion of the double array returned from the database into sobjects as fast as possible.  It was a little frustrating, as I believe I was nearing the limits of Python performance.  I guess the JIT really does make a difference.  In real-world use it may bring other speedups as well, especially when drawing the interface.

Your changes are pretty trivial, so yes, you should do a pull request.  I am not 100% sure what the difference between these two lines is, no matter how many times I have read the Python unicode documentation over the years:

-            value = unicode(description[0], 'utf-8')
+            value = description[0].decode('utf-8')

Likely, this unicode() call was added to resolve the ascii errors that always seem to pop up.  This one pertained to retrieving column names that were unicode, especially in languages such as Chinese and Japanese.  We need to make sure that would still work.
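For the record, a minimal sketch of why the two spellings behave the same on a byte string (the Chinese bytes below are just an illustration, not taken from tactic):

```python
# On a Python 2 byte string, unicode(s, 'utf-8') and s.decode('utf-8')
# perform the same UTF-8 decode; only the .decode() form also exists in
# Python 3 (on bytes), which makes it the forward-compatible spelling.

raw = b'\xe9\xa2\x9c\xe8\x89\xb2'  # UTF-8 bytes of the Chinese word for "colour"
text = raw.decode('utf-8')         # works on CPython 2/3 and pypy

print(len(text))  # 2: two characters, not six bytes
```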

Diego

The SType editor has a check in place that shows an error message when non-English (it should say non-Latin) characters are used.
I tested using Chinese characters for column names anyway, and it works OK with python; however, it breaks with pypy. I think this is because psycopg2cffi returns unicode by default. This should not be a big problem, so I submitted the PR. Moreover, the unicode() type is going away with Python 3.
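To illustrate the breakage I suspect (assuming psycopg2cffi really does hand back column names as text already): calling .decode('utf-8') on a value that is already unicode fails, so a guard like the hypothetical helper below would keep both drivers happy.

```python
# On Python 2, u'...'.decode('utf-8') first re-encodes via ASCII and
# raises UnicodeEncodeError for Chinese column names; on Python 3,
# str has no .decode() at all. Decoding only actual bytes avoids both.

def to_text(value):
    """Return value as unicode text, decoding only when it is bytes."""
    if isinstance(value, bytes):
        return value.decode('utf-8')
    return value

name = u'\u989c\u8272'  # an already-decoded Chinese column name

print(to_text(name) == name)                         # True: left untouched
print(to_text(b'\xe9\xa2\x9c\xe8\x89\xb2') == name)  # True: bytes decoded
```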

Diego