I want to read a list on a webpage using the BeautifulSoup module in Python. The HTML code is as follows: ... Business Chinese ... I parse the document using BeautifulSoup. Now I want to loop through the elements of the list, so I use the next_siblings attribute as follows: first_element = soup.ul.li for items in soup.ul.li.next_sibli...

Error importing BeautifulSoup lib

stackoverflow.com - 2013-04-23 13:41:38

I installed BeautifulSoup with the command: sudo easy_install BeautifulSoup4 I got the message: Searching for BeautifulSoup4 Best match: beautifulsoup4 4.1.3 Processing beautifulsoup4-4.1.3-py2.6.egg beautifulsoup4 4.1.3 is already the active version in easy-install.pth Using /Library/Python/2.6/site-packages/beautifulsoup4-4.1.3-py2.6.eg...

Annoying Ctrl+M issue parsing python files

stackoverflow.com - 2012-03-29 02:20:09

In a Boost.Python embedding in C++, I have C++ parsing a Python file (via boost-python) containing a simple function (termed a "command" in what follows), which in turn calls a C++ method to complete a certain implementation. While this seems ridiculous, we chose to do it for the advantages of logging and the other flexibilities that py...

python parsing .cfg file

stackoverflow.com - 2012-06-12 08:21:41

I am working with Python scripts. I imported the MODO settings data into a file in XML format. My .cfg file looks like this: <?xml version="1.0" encoding="UTF-8"?> <camera> <Position> <X> 2.0 </X> <Y> 0.75 </Y> <Z> 4.0 </Z> </Position> " " so on...... Now, I wan...
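
The question above is cut off, so what the asker wants to do with the file is unknown. As a minimal sketch, assuming the goal is simply to read the X, Y and Z values, the standard-library xml.etree.ElementTree module can load such a file. The closing </camera> tag and the CFG constant are assumptions added for illustration; the original snippet is truncated before the end of the document.

```python
import xml.etree.ElementTree as ET

# Sample data modelled on the snippet above. The closing </camera> tag is an
# assumption, because the original listing is truncated.
CFG = b"""<?xml version="1.0" encoding="UTF-8"?>
<camera>
  <Position>
    <X> 2.0 </X>
    <Y> 0.75 </Y>
    <Z> 4.0 </Z>
  </Position>
</camera>"""

root = ET.fromstring(CFG)

# Iterating over an element yields its direct children; float() tolerates
# the surrounding whitespace in the text of each tag.
position = {axis.tag: float(axis.text) for axis in root.find("Position")}
print(position)
```

For a file on disk, ET.parse(path).getroot() would take the place of ET.fromstring.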

Handling "class" attribute in Beautifulsoup

stackoverflow.com - 2011-02-18 11:58:10

I'm having trouble parsing HTML elements with a "class" attribute using BeautifulSoup. The code looks like this: soup = BeautifulSoup(sdata) mydivs = soup.findAll('div') for div in mydivs: if (div["class"]=="stylelistrow"): print div I get an error on the same line "after" the script finishes. File "./beautifulcoding.py", line 130, in getl...

NLTK MaltParser won't parse

stackoverflow.com - 2012-03-01 22:39:44

I am trying to use MaltParser from NLTK. I could get to the point of configuring the parser: import nltk parser = nltk.parse.malt.MaltParser() parser.config_malt() parser.train_from_file('malt_train.conll ') but when it comes to actual parsing, parser returns an error: File "<stdin>", line 1, in <module> File "/Library/Python/2...

Syntax error - Python re.search (character class, caret)

stackoverflow.com - 2012-03-24 19:44:58

Scraping pages using BeautifulSoup; trying to filter out links that end in "...html#comments" Code follows: import urllib.request import re from bs4 import BeautifulSoup base_url = "http://voices.washingtonpost.com/thefix/morning-fix/" soup = BeautifulSoup(urllib.request.urlopen(base_url)).findAll('a') links_to_follow = [] for i in soup:...

Save tree to a variable with HtmlAgilityPack

stackoverflow.com - 2013-03-23 14:39:59

I'm a newcomer to C#, and I'm looking for a similar function in HtmlAgilityPack. The Python parsing library BeautifulSoup has a function called contents. How can I do this with HtmlAgilityPack?...

parsing xml file using java for - android based application

stackoverflow.com - 2012-10-12 08:49:14

I am new to parsing XML files in Java. I have some idea of how to parse values from attributes and values that reside inside tags, but in my XML the values reside in a different location: Ram : 45% CPU : 49% Undecided : 6% This is my XML format; here I want to parse the percentage values from the XML. If anyone knows how to parse the values,...

Python: cut equivalent in Python?

stackoverflow.com - 2013-03-19 00:39:04

I want to parse a path (not filename) by the forward slash. Below takes the full path "filename" and reads up to the 7th "/". EDIT: I realized the above was confusing when I stated filename. I meant, I needed to parse the full path. e.g. I could need the first 7 "/"s to the left and remove 5 trailing "/"s. Python: "/".join(filename.split(...
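
The split-and-join idea that the question starts to show can be sketched as a small helper. This is an illustration under assumptions, not the asker's code: cut_path, its start/stop parameters, and the sample path are hypothetical names chosen here.

```python
def cut_path(path, start=None, stop=None, sep="/"):
    """Keep fields start..stop of a path, roughly like `cut -d/ -f...`.

    Fields are counted with normal Python slice indices, so negative
    values count from the right-hand end of the path.
    """
    fields = path.split(sep)
    return sep.join(fields[start:stop])


# Hypothetical sample path. The leading "/" produces an empty first field.
full_path = "/a/b/c/d/e/f/g/h/i.txt"

print(cut_path(full_path, stop=7))      # first seven fields
print(cut_path(full_path, stop=-5))     # drop the last five fields
```

Because an absolute path begins with the separator, the first field is an empty string; that is why the result of keeping the first seven fields still starts with "/".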

Bad link crashes Python IRC bot

stackoverflow.com - 2012-07-28 04:19:18

My bot uses Beautiful soup to parse HTML, and also prints out the web page title of a link said in IRC. This all works except for one thing: If someone gives a dead/fake link, the bot crashes. The link grabber triggers when "http" is found, so for example if someone just said "http", it crashes because there's no response. Does anyone kno...

Installing easy_install, NOT SO EASY

stackoverflow.com - 2012-03-21 02:53:19

I am trying to install easy_install in order to use BeautifulSoup... However, I have no clue what my PATH directory is... When I run easy_install BeautifulSoup, I get error: Not a recognized archive type: C:\docume~1\tom\locals~1\temp\weasy_install-w6haxs\BeautifulSoup-3.2.1.tar.gz I am guessing this has something to do with the P...

Efficient way of parsing fixed width files in Python

stackoverflow.com - 2011-02-06 15:54:41

I am trying to find an efficient way of parsing files that hold fixed-width lines. Example: the first 20 chars represent a column, chars 21 to 30 another one, and so on. Let's assume that the line holds 100 chars. What would be an efficient way to parse a line into several components? I could use string slicing per line, but it's a little bit ugl...
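
One way to keep the per-line slicing tidy is to compute the slice objects once and reuse them for every line. The sketch below assumes a layout of 20, 10 and 70 characters; only the first two widths come from the question, and the third, along with the names FIELD_WIDTHS, make_parser and parse_line, is a placeholder for illustration.

```python
# Assumed layout: columns 1-20, 21-30, and the remaining 70 of 100 characters.
FIELD_WIDTHS = (20, 10, 70)


def make_parser(widths):
    """Build a function that splits a fixed-width line into stripped fields."""
    slices = []
    start = 0
    for width in widths:
        slices.append(slice(start, start + width))
        start += width

    def parse(line):
        # Each slice is precomputed, so parsing a line is just indexing.
        return [line[s].strip() for s in slices]

    return parse


parse_line = make_parser(FIELD_WIDTHS)

# Hypothetical 100-character record built for the demonstration.
line = "Alice".ljust(20) + "42".ljust(10) + "remaining text".ljust(70)
print(parse_line(line))
```

The standard-library struct module is another option for byte-oriented records; plain slices are used here because they work directly on text.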

Timetable Web Scraping with multiple tables (Python)

stackoverflow.com - 2013-03-15 12:40:14

I'm just looking for some info regarding Python web scraping. I'm trying to get all the data from this timetable, and I want to have each class linked to the time it's on. Looking at the HTML, there are multiple tables (tables within tables). I'm planning to use Google App Engine with Python (perhaps BeautifulSoup also). Any suggestions on...

Wsdl String Parsing Problem

codecall.net - 2012-07-27 20:13:02

Hi, I'm trying to parse a WSDL string and I keep getting a SAXException. This is where the processing stops: Document doc = docBuilder.parse(new InputSource(new StringReader(wsdl))); 'wsdl' is the string I'm trying to parse. I have no experience at all in Java programming and I've been stuck on this for ages. Does anyone have a clue? Than...

XML parsing - the ^H character/symbol?

stackoverflow.com - 2012-07-12 20:36:12

I'm having a really bad time trying to clean up some XML so I can parse it in Python with etree. Basically before my Python script reads it, I'm trying to escape all the special characters in each string entry that are giving me 'xml.parsers.expat.ExpatError: not well-formed' So while I'm generating the XML string entries, I'm using to r...

How to find the error line in HTML when HTMLParserError occurs

stackoverflow.com - 2012-05-21 10:48:33

I am writing a web crawler in Python, but sometimes it throws HTMLParserError: junk characters in start tag: u'\u201dTPL_password_1\u201d\r\n\t\t', at line 21285, column 6. It says the error was found at line 21285; does that mean line 21285 of the HTML source code? If not, how can I know what is the current...

Parse human-format date ranges in Python

stackoverflow.com - 2012-04-26 21:25:46

I have some human-style date ranges, in strings, like the following: "22-24th April 2012"; "14-23 July"; "20th June - 5th July". I want to parse these in Python so that I can end up with two datetime objects: one for the start, one for the end. Is there any module that will let me do this? I've tried parsedatetime, and it looks like the evalRange...

Parsing in order with rangeOfString:

stackoverflow.com - 2012-03-26 01:58:51

The issue I have is that I want to parse strings in order, so for instance, parsing "one three two", adding that to another string, and printing "one three two". I'm using rangeOfString: , but when I parse that string, it returns "one two three". I know that the order of parsing in my case is the placement of the statements, but how do...

boost::spirit. Parsing (name) (description) text to a map

stackoverflow.com - 2012-06-19 19:37:11

I am currently working on parsing a DB schema in a text file in the following format: (table_name) (table_description) The delimiter between elements is a double return ( I need to parse this to a map, using boost::spirit for parsing. The problem is that the table_description can also contain double returns ( The table_name has a strict format, this i...

iPhone app hpple HTML Parsing fatal error: 'libxml/tree.h' file not found [2]?

stackoverflow.com - 2012-06-02 13:44:35

I am trying to parse HTML URL content using hpple for an iPhone app. I want to parse and get data from a URL like http://www.example.com/mobile/403.html. I searched Google and found hpple for HTML parsing, and I got the sample hpple HTML parsing code from GitHub. When I run the project, the following error occurs: 'libxml/tree.h'...

python: URL parsing differences between urllib and urllib2

stackoverflow.com - 2013-06-12 00:14:14

I'm trying out the following piece of Python code: import urllib import urllib2 url = 'https://AzureDiamond:hunter2@example.com/my/rest/api.json' u = urllib.urlopen(url) # works fine u = urllib2.urlopen(url) # InvalidURL: nonnumeric port: 'hunter2@example.com' I've tried urllib.quote without any success, I suspect urllib2 and urllib are p...

Parsing Excel file without Apache POI

stackoverflow.com - 2012-04-20 17:09:10

I know that we can use Apache POI to parse an Excel file and get data. But I heard a strange thing: that an Excel file can be parsed in a similar way to how we parse CSV (just read the file from a file stream and separate each column value with a "comma" separator). When we parse Excel, we have to use tab as the delimiter. Is it possible? If yes...

Angular JS slowness with IE and iPad

stackoverflow.com - 2013-05-21 16:44:15

I've started using Angular and noticed that, with the full DOM parsing it does, it seems to be slow when included on a large page. The issue is very noticeable in IE, and much less so in FF or Chrome. Has anyone found a way to: (1) not parse the whole DOM, or (2) a clever solution to make Angular work faster on IE9+? Is there any way to have a...

Scala: Best way to parse command-line parameters (CLI)?

stackoverflow.com - 2010-02-23 04:27:09

What's the best way to parse command-line parameters in Scala? I personally prefer something lightweight that does not require external jar. Related: Java library for parsing command-line parameters? What parameter parser libraries are there for C++? Best way to parse command line arguments in C#...

How can I do background parsing of data with Symfony2?

stackoverflow.com - 2012-06-03 10:21:45

I am writing a web application in PHP with Symfony2. The user can upload a CSV file with data that is saved to the database. Parsing each row of the CSV file takes about 0.2 seconds because I make some requests to the Google Maps API. So when you upload a CSV file with 5000 rows, which is a realistic case in my app, it may take 16 minute...

Python: Parse ISO 8601 date and time from a string (using the standard modules)

stackoverflow.com - 2012-02-23 12:41:59

I want to parse the date for entries given by SVN: svn list --xml https://subversion:8765/svn/Foo/tags/ If I am not mistaken, it is given using the ISO 8601 standard. An example is: dateString = "2012-02-14T11:22:34.593750Z" I am using Python 2.7 and am looking for the best way to process it using the standard modules only. I think I'll p...
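
For a timestamp in exactly the shape quoted above, datetime.strptime from the standard library is usually enough. The sketch below is one possible approach rather than the asker's eventual solution, and the format string is an assumption that only fits timestamps with fractional seconds and a literal trailing "Z".

```python
from datetime import datetime

# Example value taken from the question above.
dateString = "2012-02-14T11:22:34.593750Z"

# %f accepts one to six digits of fractional seconds. The trailing "Z" is
# matched as a literal character, so the result is a naive datetime that
# should be read as UTC.
parsed = datetime.strptime(dateString, "%Y-%m-%dT%H:%M:%S.%fZ")
print(parsed.isoformat())
```

Timestamps without fractional seconds, or with a numeric offset such as +01:00, would not match this format and would need separate handling.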

Why does SimpleDateFormat parse incorrect date?

stackoverflow.com - 2013-03-11 10:25:19

I have a date in string format and I want to parse it into a util date. var date ="03/11/2013" I am parsing this as: new SimpleDateFormat("MM/dd/yyyy").parse(date) But the strange thing is that if I pass "03-08- 201309 hjhkjhk " or "03- -2013" or "-88-201378", it does not throw an error; it parses it. For this now, I have to write rege...

How do I get parent and nested values using BeautifulSoup?

stackoverflow.com - 2012-07-07 19:06:19

I'm using BeautifulSoup to extract categories and subcategories from an HTML page. The HTML looks like this: <a class='menuitem submenuheader' href='#'>Beverages</a><div class='submenu'><ul><li><a href='productlist.aspx?parentid=053&catid=055'>Juice</a></li> </ul></div> Where B...

Following Acquisition By Facebook, Parse Rolls Out Parse Hosting

allfacebook.com - 2013-05-07 15:31:22

Cloud application platform Parse, which was acquired by Facebook in April, announced the launch of Parse Hosting Tuesday, joining its four existing offerings: Parse Data, Parse Social, Parse Push, and Cloud.

how can parse this json file in iphone

stackoverflow.com - 2012-09-11 13:01:43

I am new to iPhone programming. Can anyone please tell me how to parse JSON on the iPhone? I am using JSON parsing in my application, and my JSON data is as follows: the file in JSON format is like dz... "firstName": "John", "lastName": "Smith", "age": 25, "address": { "streetAddress": "21 2nd Street", "city": "New York", "state": "NY"...

Using a ParseKit grammar to parse timestamps

stackoverflow.com - 2012-03-15 13:23:06

I've a fairly simple question about ParseKit and parsing timestamps... how do I go about forcing the symbolic-nature of a dot/period. For example, if I am trying to parse 2008-01-25 , I could use something like date = /\d{4}/ '-' /\d{2}/ '-' /\d{2}/ . In fact, there is a date.grammar shipped with ParseKit that does exactly this (intere...

Java X509 Certificate parsing and validating

stackoverflow.com - 2012-04-06 14:17:34

I'm trying to process X509 certificates in several steps and running into a couple of problems. I'm new to JCE, so I'm not completely up to date on everything yet. We want to be able to parse several different X509 certificates based on different encodings (PEM, DER and PKCS7). I've exported the same certificate from https://belgium.be in P...

Saxparser exception in Android

stackoverflow.com - 2012-03-25 03:47:24

I am parsing data from this XML URL. The text input varies, depending on the user. Whenever there are spaces in the text variable I get this exception: org.apache.harmony.xml.ExpatParser$ParseException: At line 11, column 2: mismatched tag org.apache.harmony.xml.ExpatParser.parseFragment(ExpatParser.java:520) org.apache.harmony.xml.Exp...

how to find the authentication used on a website

stackoverflow.com - 2012-03-02 06:23:08

I've been reading about beautifulSoup, http headers, authentication, cookies and something about mechanize. I'm trying to scrape my favorite art websites with python. Like deviant art which I found a scraper for. Right now I'm trying to login but the basic authentication code examples I try don't work. So question, How do I find out what...

Parsing/editing a docx file with PHP

stackoverflow.com - 2013-04-19 13:30:29

I've been asked to write a PHP script that should read/parse a docx file and do some operations, such as duplicating a specific paragraph/table and filling in some variables (#myvar or $myvar) with values. What do you guys recommend: use the word/document.xml file directly, or convert the whole document to an HTML file and then parse it using DO...

Parsing a pwdump file python

stackoverflow.com - 2012-07-04 03:58:25

I'm trying to parse a pwdump file in Python. The content of a pwdump file looks like this: ...[snip] Domain\TESTIN$::aad3b435b51404eeaad3b435b51404ee:31d6cfe0d16ae931b73c59d7e0c089c0::: Guest(current):501:aad3b435b51404eeaad3b435b51404ee:31d6cfe0d16ae931b73c59d7e0c089c0::: Guest(hist_01):501:aad3b435b51404eeaad3b435b51404ee:31d6cfe0d16ae9...

How to Json Parsing In asp .net

stackoverflow.com - 2012-03-24 11:45:20

http://www.taxmann.com/TaxmannWhatsnewService/Services.aspx?service=gettopstoriestabnews This is my web service. I have to parse it and store all the values in strings. Please help me with how to parse it using ASP.NET (C#) so that I can store: news_id as it variable news_title as title variable news_short_description as description news_date as date...

timezone parsing discrepancies between ruby versions

stackoverflow.com - 2013-02-27 12:02:10

In my local machine, I am using RVM >> ruby -v => ruby 1.9.3p194 (2012-04-20 revision 35410) [i686-linux] # in the terminal >> date => Wed Feb 27 20:00:17 PHT 2013 In our staging server, we are using rbenv >> ruby -v => ruby 1.9.3p194 (2012-04-20 revision 35410) [x86_64-linux] # in the terminal >> date => Wed Feb 27 12:00:22 UTC 2013 In m...

Updating the Python version on Red Hat 5/CentOS 5

linuxde.net - 2013-02-28 03:41:15

CentOS ships with Python 2.4.3 by default, but much of our work may need a newer version. On the Python website I saw that Python had already reached version 3.3, so I downloaded it and upgraded the Python version on CentOS. 1. Download and install: wget http://python.org/ftp/python/3.3.0/Python-3.3.0.tar.bz2 tar -jxvf Python-3.3.0.tar.bz2 cd Python-3.3.0 ./configure make && make install 2. Update the link: mv /usr/bin/python /usr/bin/python-2.4.3.bak ln -s /usr/local/bin/python3.3 /u...


