
Python UDFs in Pig

I've seen the documentation here, but I confess that I find it rather lacking. I was wondering if anyone could give me a collection of examples of incorporating Python UDFs into P

Solution 1:

Python UDFs are quite limited. You cannot use Algebraic or Accumulator interfaces, nor can you write a LoadFunc in Python. For anything more complicated than a map operation you will likely need to resort to a Java UDF.
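For the simple map-style case, a minimal sketch might look like the following. It assumes the script is registered with USING jython, in which case Pig is expected to supply the outputSchema decorator. The file name udfs.py, the alias myudfs, and the function normalize are made up for illustration, and the try/except fallback is only there so the file can also be run outside Pig.

```python
# udfs.py (hypothetical file name)
#
# Assumed usage from a Pig script:
#   REGISTER 'udfs.py' USING jython AS myudfs;
#   names   = LOAD 'names.txt' AS (name:chararray);
#   cleaned = FOREACH names GENERATE myudfs.normalize(name);

try:
    # Under Pig's Jython engine the decorator should already be defined.
    outputSchema
except NameError:
    # Stand-in for local testing only; it just records the schema string.
    def outputSchema(schema):
        def decorator(func):
            func.outputSchema = schema
            return func
        return decorator


@outputSchema("word:chararray")
def normalize(word):
    """Trim whitespace and lowercase a single field (a plain map operation)."""
    if word is None:
        # Pig passes nulls through as None, so handle them explicitly.
        return None
    return word.strip().lower()
```

Anything that needs to see more than one record at a time (the Algebraic or Accumulator case) does not fit this shape, which is where a Java UDF comes in.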

That said, a more complex Python UDF with a dynamic outputSchema can be found at http://ragrawal.wordpress.com/2013/02/24/on-writing-python-udf-for-pig-a-perspective/. It may not answer your question directly, but it will give you a better understanding of what Python UDFs can do.
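To give a rough idea of what a dynamic schema means, here is a small sketch. It is not taken from the linked post. It assumes Pig's Jython engine provides the outputSchemaFunction and schemaFunction decorators; the names square and square_schema are illustrative, and the fallback definitions exist only so the file runs outside Pig.

```python
try:
    # Under Pig's Jython engine these decorators should already be defined.
    outputSchemaFunction
    schemaFunction
except NameError:
    # Stand-ins for local testing only; they just record the given name.
    def outputSchemaFunction(schema_function_name):
        def decorator(func):
            func.outputSchemaFunction = schema_function_name
            return func
        return decorator

    def schemaFunction(name):
        def decorator(func):
            func.schemaFunction = name
            return func
        return decorator


@outputSchemaFunction("square_schema")
def square(num):
    """Square a number; the output type is decided by square_schema."""
    if num is None:
        return None
    return num * num


@schemaFunction("square_schema")
def square_schema(input_schema):
    """Return the input schema unchanged, so int stays int and double stays double."""
    return input_schema
```

The point of the schema function is that the output schema is computed from the input schema at plan time instead of being fixed in a string.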

Solution 2:

This may not answer most of your specific questions, but this blog post and its linked code contain several good examples of using Pig with Python, including the use of Store/Load and how they interact with Python.
