Showing posts with label Python. Show all posts

Wednesday, January 11, 2017

Python BeautifulSoup Example

#!/usr/bin/python

try:
   from urllib.request import urlopen  # Python 3
except ImportError:
   from urllib import urlopen  # Python 2
from bs4 import BeautifulSoup

url = 'https://www.theice.com/marketdata/reports/icefutureseurope/BrentMarkers.shtml'

html = urlopen(url)

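# Name the parser explicitly so the result does not depend on which parsers are installed
bsObj = BeautifulSoup(html.read(), 'html.parser')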

# The markers are in a table, and the page contains several tables.
# The afternoon markers are in the table that follows a paragraph with
# the text ICE BRENT AFTERNOON MARKERS

# Find the data tables in the document (matched on the exact class string)
tables = bsObj.find_all('table', {'class':'table table-responsive table-data table-align-left'})

table_body = None

# Iterate over the tables looking at the previous paragraph
for table in tables:

   p = table.find_previous('p')

   # Check for the afternoon markers text
   # (find_previous returns None when no paragraph precedes the table)
   if p is not None and p.get_text().strip() == 'ICE BRENT AFTERNOON MARKERS':

      print(p.get_text())

      # Extract that table  
      table_body = table.find('tbody')

      break

if table_body is not None:

   # Get all the rows
   rows = table_body.find_all('tr')

   # Print each row
   for row in rows:

      cols = row.find_all('td')

      print('%s %s %s' % (cols[0].text.strip(), cols[1].text.strip(), cols[2].text.strip()))

else:

   print('No markers!')

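# Keep the console window open until the user presses Enter
try:
   raw_input('Press Enter to continue ...')  # Python 2
except NameError:
   input('Press Enter to continue ...')  # Python 3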

##END##
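
The lookup used in the script (walk the tables, check the paragraph just before each one) can be tried offline. The following is only a sketch: the HTML snippet, the shortened class name `table-data`, the cell values, and the helper name `rows_after_heading` are all made up for illustration and are not taken from the ICE page.

```python
from bs4 import BeautifulSoup

# Invented sample markup; the contract names and numbers are placeholders.
SAMPLE_HTML = """
<p>ICE BRENT MORNING MARKERS</p>
<table class="table-data">
  <tbody>
    <tr><td>Mar17</td><td>55.10</td><td>0.25</td></tr>
  </tbody>
</table>
<p>ICE BRENT AFTERNOON MARKERS</p>
<table class="table-data">
  <tbody>
    <tr><td>Mar17</td><td>55.40</td><td>0.30</td></tr>
    <tr><td>Apr17</td><td>55.90</td><td>0.35</td></tr>
  </tbody>
</table>
"""


def rows_after_heading(html, heading):
    """Return the cell text of the first table whose preceding <p> equals heading."""
    soup = BeautifulSoup(html, 'html.parser')
    for table in soup.find_all('table', class_='table-data'):
        # find_previous returns the nearest earlier <p>, or None if there is none
        p = table.find_previous('p')
        if p is not None and p.get_text().strip() == heading:
            return [[td.get_text().strip() for td in tr.find_all('td')]
                    for tr in table.find_all('tr')]
    return []


afternoon = rows_after_heading(SAMPLE_HTML, 'ICE BRENT AFTERNOON MARKERS')
for row in afternoon:
    print(' '.join(row))
```

Returning the rows as lists instead of printing them inside the loop makes the lookup easy to check against a saved copy of the page before pointing it at the live URL.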


Thursday, June 16, 2016

Tuesday, December 13, 2011

The Zen of Python


janeiros@harlie:~$ python
Python 2.6.5 (r265:79063, Apr 16 2010, 13:09:56)
[GCC 4.4.3] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import this
The Zen of Python, by Tim Peters

Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Special cases aren't special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
In the face of ambiguity, refuse the temptation to guess.
There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.
Now is better than never.
Although never is often better than *right* now.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea -- let's do more of those!
>>>

Wednesday, August 10, 2011

Some notes on Python

Data structures
* tuples - ( ) - immutable
* lists - [ ] - changeable
* dictionaries - { } - changeable # Same concept as a hash in Perl
   keys()
   values()
   key in d # Membership test; calls __contains__(key)

* Strings are immutable

* None
* True
* False

To access the last element: [-1] # Same idea as in Perl
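
A quick sketch of the notes above. All names and values are made up for illustration.

```python
point = (3, 4)                    # tuple: immutable
langs = ['perl', 'python']        # list: changeable
ports = {'http': 80, 'ssh': 22}   # dictionary: changeable

langs[0] = 'awk'                  # lists can be changed in place
ports['https'] = 443              # so can dictionaries

try:
    point[0] = 10                 # tuples cannot
except TypeError as exc:
    print('tuple: %s' % exc)

try:
    'abc'[0] = 'x'                # strings are immutable too
except TypeError as exc:
    print('str: %s' % exc)

print(sorted(ports.keys()))       # ['http', 'https', 'ssh']
print(sorted(ports.values()))     # [22, 80, 443]
print('ssh' in ports)             # True; the in operator calls __contains__
print(langs[-1])                  # python
print(point[-1])                  # 4
```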

For lists:
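   append(x) # Adds one element to the end of the list
   extend(iterable) # Adds every element of the iterable to the end of the list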
   pop(0) or pop() # pop(0) removes and returns the first element; pop() removes and returns the last
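
A short sketch of the list methods above, with made-up values. Note the difference between append() and extend() when the argument is itself a list.

```python
items = ['a', 'b']
items.append('c')             # items is now ['a', 'b', 'c']
items.extend(['d', 'e'])      # items is now ['a', 'b', 'c', 'd', 'e']

nested = ['a']
nested.append(['b', 'c'])     # append adds the list as ONE element: ['a', ['b', 'c']]

first = items.pop(0)          # removes and returns the first element, 'a'
last = items.pop()            # removes and returns the last element, 'e'

print(items)                  # ['b', 'c', 'd']
print(nested)                 # ['a', ['b', 'c']]
```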

Monday, July 25, 2011

Reading pcap files with Python

#!/usr/bin/python

import sys
import pcapy
from impacket import ImpactDecoder
import re

def main(argv):

        try:
                cap = pcapy.open_offline(argv[1])

                (header, payload) = cap.next()

                while header:
                        (seconds, micros) = header.getts()

                        # Parse the Ethernet packet
                        decoder = ImpactDecoder.EthDecoder()
                        ether = decoder.decode(payload)

                        # Parse the IP packet inside the Ethernet packet
                        iphdr = ether.child()

                        # Parse the TCP packet inside the IP packet
                        tcphdr = iphdr.child()

                        # Get the source and destination IP addresses
                        src_ip = iphdr.get_ip_src()
                        dst_ip = iphdr.get_ip_dst()

                        if tcphdr.child() is not None:
                                body = tcphdr.child().get_packet()

                                # FIX messages start with the BeginString tag, 8=FIX
                                isFIX = re.match('8=FIX', body)
                                if isFIX is not None:
                                        print("%d.%06d %s" % (seconds, micros, body))

                        (header, payload) = cap.next()

        except pcapy.PcapError:
                pass

if __name__ == "__main__":
        main(sys.argv)
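
The filtering and formatting step of the script can be tried without pcapy or a capture file. This is only a sketch under assumptions: the helper name `format_fix_line` is hypothetical, the sample message is made up (its length and checksum fields are not valid), and replacing the SOH field delimiter (0x01) with `|` is an addition for readability that the script above does not do.

```python
import re

# A FIX message begins with the BeginString tag, e.g. 8=FIX.4.2
FIX_PREFIX = re.compile(b'8=FIX')


def format_fix_line(seconds, micros, payload):
    """Return 'seconds.micros message' for a FIX payload, or None otherwise."""
    if FIX_PREFIX.match(payload) is None:
        return None
    # Assumption: show the non-printable SOH delimiter as '|' for readability
    text = payload.decode('ascii', errors='replace').replace('\x01', '|')
    return '%d.%06d %s' % (seconds, micros, text)


# Made-up payloads for illustration only
sample = b'8=FIX.4.2\x019=5\x0135=0\x0110=000\x01'
print(format_fix_line(1311600000, 42, sample))
print(format_fix_line(1311600000, 43, b'GET / HTTP/1.1'))
```

The `%06d` pads the microseconds with leading zeros, so 42 microseconds is printed as `.000042` and not `.42`.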