I have over a million text files compressed into 40 zip files. I also have a list of about 500 model names of phones. I want to find out the number ... |
How would one write a regular expression to use in python to split paragraphs?
A paragraph is defined by 2 linebreaks (\n). But one can have any amount of spaces/tabs together with ... |
I'm having a bit of trouble getting a Python regex to work when matching against text that spans multiple lines. The example text is ('\n' is a newline)
some Varying TEXT\n
\n
DSJFKDAFJKDAFJDSAKFJADSFLKDLAFKDSAF\n
[more of ...
|
I'm trying to parse the title tag in an RSS 2.0 feed into three different variables for each entry in that feed. Using ElementTree I've already parsed the RSS so ... |
I'm trying to handle a bunch of files, and I need to alter them to remove extraneous information in the filenames; notably, I'm trying to remove text inside parentheses. For example:
filename ...
|
I need to parse a configuration file which looks like this (simplified):
<config>
<links>
<link name="Link1" id="1">
<encapsulation>
<mode>ipsec</mode>
</encapsulation>
</link>
<link name="Link2" id="2">
<encapsulation>
<mode>udp</mode>
</encapsulation>
</link>
</links>
My goal is to be able to change ... |
Newbie to Python.... help requested with the following task :-)
I have tree of various files, some of them are C source code.
I would like to modify these C files with python ... |
|
Sample Text:
SUBJECT = 'NETHERLANDS MUSIC EPA'
CONTENT = 'Michael Buble performs in Amsterdam Canadian singer Michael Buble performs during a concert in Amsterdam, The Netherlands, 30 October 2009. Buble released his new ...
|
Ok so i have this piece of code:
def findNReplaceRegExp(file_name, regexp, replaceString, verbose=True, confirmationNeeded=True):
'''Replaces the oldString with the replaceString in the file given,\
returns the number of replaces
'''
...
|
I need to parse text lists:
1 List name
1 item
2 item
3 item
2 List name
1 item
2 item
3 item
3 List name
1 item
2 item
3 item
I was trying to use regular expression to split first level ... |
I've been looking at the re documentation and at other questions but I keep running into trouble with regex.
I need to take what ever is in the [tag] off of ... |
What are common approaches for translating certain words (or expressions) inside a given text, when the text must be reconstructed (with punctuation and everything)?
The translation comes from a lookup table, ... |
When you try opening a MS Word document or for that matter most Windows file formats, you will see gibberish as given below broken intermittently by the actual text. I need ... |
I have a large textfile, which has linebreaks at column 80 due to console width. Many of the lines in the textfile are not 80 characters long, and are not affected ... |
I needed to strip the Chinese out of a bunch of strings today and was looking for a simple python regex. Any suggestions?
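One possible approach, as a minimal sketch: it assumes "Chinese" means characters in the CJK Unified Ideographs block (U+4E00 to U+9FFF) and that the strings are Unicode text (Python 3 `str`). The names `CJK_RE` and `strip_chinese` are made up for illustration. Extension blocks and CJK punctuation are not covered and would need extra ranges.

```python
import re

# Assumption: only the basic CJK Unified Ideographs block counts as "Chinese".
CJK_RE = re.compile('[\u4e00-\u9fff]+')

def strip_chinese(text):
    """Return text with every run of CJK ideographs removed."""
    return CJK_RE.sub('', text)

print(strip_chinese('abc\u4e2d\u6587def'))
```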
|
Trying to write code that searches hash values for specific strings (input by the user) and returns the hash if searchquery is present in that line.
Doing this to kind of ... |
Basically, I have a file like this:
Url/Host: www.example.com
Login: user
Password: password
Data_I_Dont_Need: something_else
How can I use RegEx to separate the details to ... |
I have a text file that has sets of text I need to extract that looks something like as follows:
ITEM A blah blah blah ITEM B bloo bloo bloo ITEM A ...
|
I'm trying to match a "#" followed by letters if and only if it's preceded by newline, whitespace or is the first character in a string. The first two I've done, ... |
Given a string of text, in Python:
s = "(((((hi abc )))))))"
s = "***(((((hi abc ***&&&&"
How do I replace all non-alphabetic symbols that occur more than 3 times...as blank string
For all the ... |
I have a list of approximately 300 words and a huge amount of text that I want to scan to know how many times each word appears.
I am using the |
How can one change the following text
The quick brown fox jumps over the lazy dog.
to
The quick brown fox +
jumps over the +
lazy dog.
using ... |
I need a regex that matches
re.compile('userpage')
href="www.example.com?u=userpage&as=233&p=1"
href="www.example.com?u=userpage&as=233&p=2"
I want to get all urls that have u=userpage and p=1
How can I modify the regex above to find both u=userpage and p=1?
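A hedged sketch of one way to do this with a single pattern. It assumes that `u=userpage` always comes before `p=1` in the query string and that `p` is the last parameter, as in the two sample lines; `USERPAGE_P1` is a hypothetical name. If parameter order can vary, parsing the query string with `urllib.parse` would likely be more robust than a regex.

```python
import re

# Assumptions: u=userpage precedes p=1, and p=1 is the final parameter
# (the closing quote right after "p=1" keeps p=10, p=11, ... from matching).
USERPAGE_P1 = re.compile(r'href="([^"]*[?&]u=userpage&(?:[^"]*&)?p=1)"')

html = '''
href="www.example.com?u=userpage&as=233&p=1"
href="www.example.com?u=userpage&as=233&p=2"
'''

# findall returns the captured URL for every matching href attribute.
urls = USERPAGE_P1.findall(html)
print(urls)
```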
|
I am trying to convert a file which contains ip address in the traditional format to a file which contains ip address in the binary format.
the file contents are as follows.
src-ip{ ... |
I am trying to process various texts by regex and NLTK of python -which is at http://www.nltk.org/book-. I am trying to create a random text generator and I am ... |
I'm trying to filter out some text for certain keywords that are found in a text file. I was thinking about just parsing the file line by line, take each word ... |
Is there a more elegant (pythonic + effective) way to find the word at a given position?
FIRST_WORD = re.compile(r'^(\w+)', re.UNICODE)
LAST_WORD = re.compile(r'(\w+)$', re.UNICODE)
def _get_word(self, text, position):
"""
...
|
I am trying to extract a section of text that looks something like this:
Thing 2A blah blah Thing 2A blah blah Thing 3
Where the "3" above could actually be ANY single ... |
I have a text in which only <b> and </b> have been used, for example <b>abcd efg-123</b>. How can I extract the string between these tags? Also, I need to extract 3 ... |
I have text which shows course numbers, names, grade and other information for courses taken by students. Specifically, the lines look like these:
0301 453 20071 LINEAR SYSTEMS I ...
|
I would like to enable all apt repositories in this file
cat /etc/apt/sources.list
## Note, this file is written by cloud-init on first boot of an instance ...
|
I have to replace text this way:
Some data la-la-la [image=test.png] next data...
Some data la-la-la 123 [image=test2.png]
And replace that with:
Some data la-la-la test.png next data...
Some data la-la-la 123 test2.png
I tried with re.sub ... |
Hey there.
I'm using glob.glob function in order to get a list of all .txt files on a path that i provide.
The regex I'm feeding the function as C:\build\*.txt, but it works ... |
I have formatted lines of the text, i.e.
[[item1 *,* {_item2*} *;{item3*}* ;{item4*}*]]
where * means any text between the words and brackets.
Is it possible to collect text from * to variables?
item1, after1, ...
|
u'abcde(date=\'2/xc2/xb2\',time=\'/case/test.png\')'
All I need is the contents inside the parenthesis.
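A minimal sketch of one way to pull that out, assuming there is a single outer pair of parentheses and that everything between the first `(` and the last `)` is wanted. The helper name `inside_parens` is made up for illustration.

```python
import re

def inside_parens(text):
    """Return the text between the first '(' and the last ')', or None."""
    # Greedy .* runs to the last ')' so nested or repeated ')' stay inside.
    match = re.search(r'\((.*)\)', text, re.DOTALL)
    return match.group(1) if match else None

sample = "abcde(date='2/xc2/xb2',time='/case/test.png')"
print(inside_parens(sample))
```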
|
I'm trying to change wikitext into a normal text using python regular expressions substitution. There are two formatting rule regarding wiki link.
- [[Name of page]]
- [[Name of page | Text to display]]
(http://en.wikipedia.org/wiki/Wikipedia:Cheatsheet)
Here ... |
I'm using python 2.6 on linux.
I have two text files
first.txt has a single string of text on each line. So it looks like
lorem
ipus
asfd
The second file ... |
I was wondering how to convert text similar to the following:
Chapter 3 Convex Functions 97
3.1 Definitions 98
3.2 Basic Properties 103
to:
("Chapter 3 Convex Functions 97" "#97")
("3.1 Definitions 98" "#98")
("3.2 Basic Properties 103" ...
|
Please advise - I'm going to use this as a learning point. I'm a beginner.
I'm splitting a 25 MB file into several smaller files.
A kindly guru here gave me a Ruby script. It ... |
Given a file of text, where the character I want to match are delimited by single-quotes, but might have zero or one escaped single-quote, as well as zero or more tabs ... |
I have created a software package that produces a directory full of results. I would like to test the results from some standard input files. The directories should be somewhat similar ... |
I want to remove text inside <p> tags for a block of html text. I am trying to standardize some text and remove all class, align, and other information. ... |
I need to modify all files that have a ".txt" extension within a directory in the following way:
remove all text lines beginning with the line that starts with "xxx" and the ... |
When I print the group "print(a)" the entire group is shown. When I save it to a text file "open("sirs1.txt", "w").write(a)" only the last row is saved to the file.
import re
def ...
|
So, I am successfully matching and extracting some special tagged text using the following regular expression:
theString = u"Var 1 value: %%v:123453%%, Var 2 value: %%v:984561%%, Var 3 value: %%v:123456%%"
p = re.compile("\%%v:([0-9]*)%%")
theIds ...
|
I have a text file running into 20,000 lines. A block of meaningful data for me would consist of name, address, city, state,zip, phone. My file has each of these on ... |
I am working on some code to parse text into XML. I am currently using java and jaxb to handle the XML and the in-program representation of my data. I need ... |
Hey
I am kinda new to Python programming
I have no clue how to solve that task.
for line in f:
m = re.search("f(\S+\s+,\s+\S+)", "56 ...
|
I'd like to retrieve the Auth= value from the multiline string below. I've tried Python re.match with no success. Will appreciate any help I can get
SID=DQAAALsAAABCeyCMlOaYMHkv55TUQFxA71fxE1LpgpmL1G_o8YennFwBhar2I_LNmJjGjvLHVQy8tSRfYdLnUIHhKyD0FTZBzXyG_s8U4Pt97n9hPz68ZFSM42Qv6Qxuk74TQygHJXhjLWXNuD5mMsh8_MAs-nmhSToNFIyWoP-uTZ_LN2yQS1o9MB43fzuIIxp-1euXGxMceVVrjyidrYeEB13HS5kMHH-HGjiZhoIJBmu5es7pLPj9Ie8NJZ1K3kFhdVEJa4sLSID=DQAAAL4AAACypRIVyVXcs5zYIeUEt9v-wEwPKgQ8Oe23_URsDeHCg-rR2qQK4dTxPV1J6BPTO-6Zly2H9t4sVhm0vHe8IT6sKLdX2IQ8PgGMtSHQNkpQ8zEan0CyFyUetbSW4af6mlk2pksDpvXNm5GtNTj5eTwkCQUmgGep42u5iuCGFy-o9a1cQWz45NO_J8zIYnBdOqlheNTqaMWpi4hpr-_u8Muzs4RjlEbkuYfDu7MrdsJAFwxf0BVW2cGBtB-K2jwaK7w*Auth=*873hdyjsbcvuei73hckwoxnaodbc8dnskc8HU1mKRqxh6yEU-9tqx148GqC7h90_190ZzxpEZOHAH5HTptliylRXvMPyqPyijMNu21bOA6ZhvZFuL8YNB3KF63YuV0n5TFJd1-rMI2LQIdPMVBnsxnEGrLIeFOugAFCZ_3OelAc4XjeKdDvIowxkNnvaooXT4kxtkQWzieA3JRKy3Y-Lbi7E0qiXC99GtHVDh5VWvdTs2LCv3wnRULtLp6 |
I have a huge text file, each line seems like this:
Some sort of general menu^a_sub_menu_title^^pagNumber
Notice that the first "general menu" has white spaces, the second part (a subtitle) each ... |
I have the following problem: I'm trying to check whether some text consists only of a number of repetitions of some pattern. I.e I have a 1000 lines text and want ... |
Regular expressions are highly unreadable and difficult to debug. Does there exist any replacement for text processing which could be handled by mere mortals?
Criteria include
- It's a library or a tool (please ...
|
I have an expression and I want to extract it in python 2.6. Here is the example:
[a]+[c]*0.6/[b]-([a]-[f]*0.9)
This should become:
(
'[a]',
'+',
'[c]',
'*',
'0.6',
'/',
...
|
Hi I have the following text.
x = """Hello, this is a\nmultiline text\nend.Hello, this is\nthe second
chunck\nend."""
This pattern of Hello, \nend. keeps on repeating. I want to ... |
I have a text file in the following format:
DELIMITER1
extract me
extract me
extract me
DELIMITER2
I'd like to extract every block of extract mes between DELIMITER1 and DELIMITER2 in the .txt file
This is my current, ... |
I have the following text:
#{king} for a ##{day}, ##{fool} for a #{lifetime}
And the following (broken) regex:
[^#]#{[a-z]+}
I want to match all #{words} but not the ##{words} (Doubling '#' acts like escaping) . ... |
I am trying to replace the Nth appearance of a needle in a haystack. I want to do this simply via re.sub(), but cannot seem to come up with an ... |
Text file 1 has the following format:
'WORD': 1
'MULTIPLE WORDS': 1
'WORD': 2
etc.
I.e., a word separated by a colon followed by a number.
Text file 2 has the following format:
'WORD'
'WORD'
etc.
I need to ... |
I'm trying to write a script that will search through a html file and then replace the form action. So in this basic code:
<html>
<head>
...
|
I am trying to get some data out of text file with the following format:
jvm: 2011-08-29 17:09:54.438864:
MemoryStatistics: [290328680, 381288448]
moniData: 2011-08-29 17:09:54.438864:
Depth: [0]
...
|
Trying to do a search/replace in python using wildcards on the contents of a text file:
If the contents of the text file looks like:
"all_bcar_v0038.ma";
"all_bcar_v0002.ma";
"all_bcar_v0011.ma";
"all_bcar_v0011.ma";
Looking to replace all the version numbers with ... |
So I am trying to search for a certain string which for example could be:
process.control.timeout=30, but the 30 could be anything. I tried this:
for line in process:
line ...
|
I have looked back and forward for a possible solution to my issue, but I guess my google-fu is very poor today. Not to mention my knowledge of regular expressions, which ... |
I'm a python newbie. My script (below) contains a function named
"fn_regex_raw_date_string" that is intended to convert
a "raw" date string like this: ... |
Possible Duplicate:
My regex is not working properly
Suppose I have a long text. From the following text I need only the abstract part. How do I avoid text ... |
In Python I have a big multiline text.
I need to get the text between {{book and }}
I tried using a regular expression.
The problem is that the text inside is a multiline string.
I tried ... |
I have written the following regex, but it's not working. Can you please help me? Thank you :-)
track_desc = '''<img src="http://images.raaga.com/catalog/cd/A/A0000102.jpg" align="right" border="0" width="100" height="100" vspace="4" hspace="4" />
...
|