2011/06/22

hello, pig

This is my first time trying Pig, one of the Hadoop-related projects.

yaboo@maniac:~$ pig -x local

2011-06-22 23:45:32,781 [main] INFO org.apache.pig.Main - Logging error messages to: /home/yaboo/pig_1308753932779.log
2011-06-22 23:45:32,849 [main] INFO org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to hadoop file system at: file:///

grunt> A = load '/etc/passwd' using PigStorage(':'); -- read /etc/passwd, using ':' as the field separator

grunt> B = foreach A generate $0 as id; -- take the first column and name it id

grunt> dump B; -- print the result to stdout

2011-06-22 23:46:40,493 [main] INFO org.apache.pig.tools.pigstats.ScriptState - Pig features used in the script: UNKNOWN
2011-06-22 23:46:40,493 [main] INFO org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - pig.usenewlogicalplan is set to true. New logical plan will be used.
...
2011-06-22 23:46:47,276 [main] INFO org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input paths to process : 1
2011-06-22 23:46:47,277 [main] INFO org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths to process : 1
(root)
(daemon)
(bin)
(sys)
(sync)
(games)
(man)
(lp)
(mail)
(news)
(uucp)
(proxy)
(www-data)
(backup)
(list)
(irc)
(gnats)
(nobody)
(libuuid)
(syslog)
(messagebus)
(avahi-autoipd)
(avahi)
(couchdb)
(speech-dispatcher)
(usbmux)
(haldaemon)
(kernoops)
(pulse)
(rtkit)
(saned)
(hplip)
(gdm)
(yaboo)
(sshd)
(hadoop)

grunt> store B into 'id.out'; -- store B in the id.out directory
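
For comparison, here is a rough Python sketch of what the load and foreach steps compute. It is only an illustration, not how Pig works internally: the helper names load and first_column are made up for this example, and the two sample lines are invented stand-ins for /etc/passwd.

```python
def load(lines, sep=":"):
    """Split each line into a tuple of fields, roughly like PigStorage(sep)."""
    return [tuple(line.rstrip("\n").split(sep)) for line in lines]


def first_column(rows):
    """Keep only field $0 of each row, like `foreach A generate $0`."""
    return [(row[0],) for row in rows]


# Invented sample data standing in for /etc/passwd (an assumption for this sketch).
sample = [
    "root:x:0:0:root:/root:/bin/bash\n",
    "daemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin\n",
]

A = load(sample)
B = first_column(A)

# Print each one-field tuple in the same style as `dump B`.
for row in B:
    print("(" + ",".join(row) + ")")
```

With the sample lines above this prints (root) and (daemon), matching the shape of the dump output.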
