[javascript] Executing Javascript from Python

I have HTML webpages that I am crawling using xpath. The etree.tostring of a certain node gives me this string:

<script>
<!--
function escramble_758(){
  var a,b,c
  a='+1 '
  b='84-'
  a+='425-'
  b+='7450'
  c='9'
  document.write(a+c+b)
}
escramble_758()
//-->
</script>

I just need the output of escramble_758(). I can write a regex to figure out the whole thing, but I want my code to remain tidy. What is the best alternative?

I am zipping through the following libraries, but I didnt see an exact solution. Most of them are trying to emulate browser, making things snail slow.

Edit: An example will be great.. (barebones will do)

This question is related to javascript python screen-scraping

The answer is


You can use js2py context to execute your js code and get output from document.write with mock document object:

import js2py

js = """
var output;
document = {
    write: function(value){
        output = value;
    }
}
""" + your_script

context = js2py.EvalJs()
context.execute(js)
print(context.output)

You can also use Js2Py which is written in pure python and is able to both execute and translate javascript to python. Supports virtually whole JavaScript even labels, getters, setters and other rarely used features.

import js2py

js = """
function escramble_758(){
var a,b,c
a='+1 '
b='84-'
a+='425-'
b+='7450'
c='9'
document.write(a+c+b)
}
escramble_758()
""".replace("document.write", "return ")

result = js2py.eval_js(js)  # executing JavaScript and converting the result to python string 

Advantages of Js2Py include portability and extremely easy integration with python (since basically JavaScript is being translated to python).

To install:

pip install js2py

You can use requests-html which will download and use chromium underneath.

from requests_html import HTML

html = HTML(html="<a href='http://www.example.com/'>")

script = """
function escramble_758(){
    var a,b,c
    a='+1 '
    b='84-'
    a+='425-'
    b+='7450'
    c='9'
    return a+c+b;
}
"""

val = html.render(script=script, reload=False)
print(val)
# +1 425-984-7450

More on this read here


quickjs should be the best option after quickjs come out. Just pip install quickjs and you are ready to go.

modify based on the example on README.

from quickjs import Function

js = """
function escramble_758(){
var a,b,c
a='+1 '
b='84-'
a+='425-'
b+='7450'
c='9'
document.write(a+c+b)
escramble_758()
}
"""

escramble_758 = Function('escramble_758', js.replace("document.write", "return "))

print(escramble_758())

https://github.com/PetterS/quickjs


One more solution as PyV8 seems to be unmaintained and dependent on the old version of libv8.

PyMiniRacer It's a wrapper around the v8 engine and it works with the new version and is actively maintained.

pip install py-mini-racer

from py_mini_racer import py_mini_racer
ctx = py_mini_racer.MiniRacer()
ctx.eval("""
function escramble_758(){
    var a,b,c
    a='+1 '
    b='84-'
    a+='425-'
    b+='7450'
    c='9'
    return a+c+b;
}
""")
ctx.call("escramble_758")

And yes, you have to replace document.write with return as others suggested


Examples related to javascript

need to add a class to an element How to make a variable accessible outside a function? Hide Signs that Meteor.js was Used How to create a showdown.js markdown extension Please help me convert this script to a simple image slider Highlight Anchor Links when user manually scrolls? Summing radio input values How to execute an action before close metro app WinJS javascript, for loop defines a dynamic variable name Getting all files in directory with ajax

Examples related to python

programming a servo thru a barometer Is there a way to view two blocks of code from the same file simultaneously in Sublime Text? python variable NameError Why my regexp for hyphenated words doesn't work? Comparing a variable with a string python not working when redirecting from bash script is it possible to add colors to python output? Get Public URL for File - Google Cloud Storage - App Engine (Python) Real time face detection OpenCV, Python xlrd.biffh.XLRDError: Excel xlsx file; not supported Could not load dynamic library 'cudart64_101.dll' on tensorflow CPU-only installation

Examples related to screen-scraping

What's the best way of scraping data from a website? Executing Javascript from Python Can scrapy be used to scrape dynamic content from websites that are using AJAX? How I can get web page's content and save it into the string variable How do I prevent site scraping? Web scraping with Python