Trouble with opening files in python - file-upload

I was trying to write a program where I can upload a file that has first and last names in it and create a new file with the first letter of each first name followed by the last name.
The file that I created is a textfile and the lines would be like this:
firstname1 lastname1 (for example, john smith)
firstname2 lastname2 (for example, jane jones)
firstname3 lastname3 (for example, jane doe)
etc...
I want to create a file that would look like this:
jsmith
jjones
jdoe
The issue is that when I open the file in Python, I get a lot of weird unwanted characters before the actual text of the file. The book I am learning from doesn't say anything about this, which is why I am posting here.
For example when I upload the file and run the following command:
newfile = open("example.file.rtf", "r")
for i in newfile:
    print(i)
I get this:
{\rtf1\ansi\ansicpg1252\cocoartf949\cocoasubrtf540
{\fonttbl\f0\fswiss\fcharset0 Helvetica;}
{\colortbl;\red255\green255\blue255;}
\margl1440\margr1440\vieww9000\viewh8400\viewkind0
\pard\tx560\tx1120\tx1680\tx2240\tx2800\tx3360\tx3920\tx4480\tx5040\tx5600\tx6160\tx6720\ql\qnatural\pardirnatural
\f0\fs24 \cf0 name 1\
name 2\
name 3 \
name 4 \
The actual text that I wrote in the textfile was just this:
name 1
name 2
name 3
name 4
Why is this happening? Why doesn't it just show the plain text? If I can't get it to do that, how can I get around this issue when I loop through the file?

You are writing the file in RTF ("Rich Text") format, which is not plain text. Those "weird unwanted characters" are being written there by your editor. Use a plain text editor like Notepad to create your file, or explicitly save it as plain text.
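Once the file is saved as plain text, the loop from the question can build the new names directly. Here is a minimal sketch, assuming every line holds a first name and a last name separated by whitespace. The function names `make_usernames` and `convert_file` and the file paths are made up for illustration.

```python
def make_usernames(lines):
    """Return first initial + last name for each 'first last' line."""
    usernames = []
    for line in lines:
        parts = line.split()
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        first, last = parts[0], parts[-1]
        usernames.append((first[0] + last).lower())
    return usernames


def convert_file(src_path, dst_path):
    # Both paths are placeholders; src_path must be plain text, not RTF.
    with open(src_path, "r") as src, open(dst_path, "w") as dst:
        for name in make_usernames(src):
            dst.write(name + "\n")
```

Calling `make_usernames(["john smith", "jane jones", "jane doe"])` gives `["jsmith", "jjones", "jdoe"]`.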

Related

Replace certain text in lines with each line another file

I have a text file with text in this order:
str/4
<</Contents(100 cups)/(Date)
Colour red
<</Contents(080 bowls)/(Date)
Status used
Pack team
<</Contents(200 John)/(Date)
School house
And another text file with a list of words in this order:
Tree house
Colon format
Same variable
Now the question is: how do I match the text between "Contents(" and ")" in each line (i.e. 100 cups, 080 bowls, 200 John) and replace it with the corresponding line from my second file? The final result should look like this:
str/4
<</Contents(Tree house)/(Date)
Colour red
<</Contents(Colon format)/(Date)
Status used
Pack team
<</Contents(Same variable)/(Date)
School house
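One possible approach, sketched in Python rather than taken from an answer in this thread: match the text inside `Contents(...)` with a regular expression and pull the next replacement from an iterator each time a match is found. The function name `replace_contents` is hypothetical. The sketch assumes at most one `Contents(...)` per line and no `)` inside the text being replaced.

```python
import re

# Captures "Contents(" and the closing ")" so only the text between them changes.
CONTENTS = re.compile(r"(Contents\()[^)]*(\))")


def replace_contents(lines, replacements):
    """Replace the text inside Contents(...) with successive replacements."""
    repl_iter = iter(replacements)

    def substitute(match):
        new_text = next(repl_iter, None)
        if new_text is None:
            return match.group(0)  # replacements ran out: keep the original
        return match.group(1) + new_text + match.group(2)

    return [CONTENTS.sub(substitute, line, count=1) for line in lines]
```

To use it on real files, read both files into lists of lines with the trailing newlines stripped, then write the returned list back out.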

Full Text Search for extracting a snippet of the text (returning the intended text and its surroundings)

I'm using a SQL FileTable and, for instance, I have a saved text file named "SOS.txt" which contains the following text:
For god's sake, save us right now please. We can't survive.
Now or never!
Now I want to find all files that contain the word "save", so I execute the following query:
SELECT * FROM FileTableExample
WHERE CONTAINS(file_stream, 'save')
and here's the result:
stream file => 0x616C692053617665207573207269676874206E6F772E0D0A4E6F77206F72206E6576657221
As you can see, I got the correct result. The third column of the result identifies the file named SOS.txt, and I have the stream_id and file_stream, but what I'm trying to find is a way to show the matched text together with its surroundings in a human-readable format.
Something like this:
Name | Excerpt
-------------+----------------------
SOS.txt |..sake, save us..
Is there any way?
Update:
After searching on the net I found this article, which is useful, but it doesn't mention full-text search on a FileTable.
Based on this article, I converted file stream to string:
SELECT CONVERT(varchar(MAX), file_stream) AS Excerpt, *
from FileTableExample
where contains(file_stream, 'save')
It works if the file is plain text like SOS.txt, but if it's a .docx or .pptx file, you are not going to get a useful conversion.
Use this: CAST(file_stream AS varchar(max))
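The cast only turns the stream into text. Cutting out the excerpt around the matched word is a separate step. The sketch below shows that step in Python, purely to illustrate the logic. It is not T-SQL and not a feature of full-text search. The function name `excerpt` and the `radius` parameter are made up for this example.

```python
import re


def excerpt(text, word, radius=6):
    """Return the first whole-word match with `radius` characters on each side."""
    match = re.search(r"\b" + re.escape(word) + r"\b", text, re.IGNORECASE)
    if match is None:
        return None
    start = max(match.start() - radius, 0)
    end = min(match.end() + radius, len(text))
    return ".." + text[start:end].strip() + ".."
```

With the SOS.txt text, `excerpt("For god's sake, save us right now please.", "save")` returns `"..sake, save us ri.."`. Like the cast above, this only helps for plain-text files.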

How to forward logs with Splunk Forwarder for files with no header, with the logs in key/value form

I already have a Splunk forwarder set up on my host.
I have certain files in the folder (/tom/mike/). The file names start with Back*.
The content of a file may be one or multiple lines. Each line has multiple fixed-position values separated by some spaces, with no header.
Content (Example: Consider "-" as one space)
Tom---516-----RTYUI------45678
Mik---345-----XYXFF------56789
I need a Splunk log entry for each line,
like:
Key1= Tom Key2=516 Key3= RTYUI Key4= 45678
Key1= Mike Key2= 345 Key3= XYXFF Key4= 56789
I know inputs.conf changes would be like below:
[monitor:///tom/mike/Back*]
index=myIndex
blacklist=\.(gz|zip|bkz|arch|etc)$
sourcetype = BackFileData
Please suggest changes that can be made in props.conf. Please keep in mind that the delimiter is fixed for each value in a line, but it's not the same (e.g. 2 spaces) for all column values. There are no headers in these files either.
You can use kvdelims if you want a search-time extraction, or you can write a transforms.conf rule and apply it in props.conf, which will extract at index time.
Here's a good article covering all those scenarios:
https://www.splunk.com/blog/2008/02/12/delimiter-based-key-value-pair-extraction.html
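To make the target format concrete, here is a small Python sketch of the transformation the question asks for. It is not Splunk configuration and says nothing about how props.conf or transforms.conf should be written. It assumes no value contains a space and no column is ever empty, so splitting on runs of whitespace is enough. The function name `to_key_values` is hypothetical.

```python
def to_key_values(line):
    """Turn 'Tom   516     RTYUI      45678' into 'Key1=Tom Key2=516 ...'."""
    fields = line.split()  # any run of whitespace counts as one delimiter
    return " ".join(f"Key{i}={value}" for i, value in enumerate(fields, start=1))
```

If a column can be blank, the split-based approach shifts the later keys, and slicing each line at the fixed column positions would be the safer choice.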

How to find the same words in two different text files and print those lines using bash?

I have two text files. The first contains a single column of words: hundreds of them, one word per line. The second contains many columns per row.
I need to find the words from first text file which are in the second text file and print the entire line from second text file where this word is, using awk, grep or other command line program. For example:
Text file #1:
car
house
notebook
Text file #2:
32233: FTD laptop
24342: TGD car
2424: jdj notebook
Output:
24342: TGD car
2424: jdj notebook
Try this:
grep -Fwf file1 file2
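In that command, -f reads the patterns from file1, -F treats them as fixed strings rather than regular expressions, and -w requires whole-word matches. For comparison, here is a rough Python sketch of the same idea. It is an approximation, not an exact reimplementation of grep: it assumes the words in the first file consist only of letters, digits, and underscores. The function name `matching_lines` is made up.

```python
import re


def matching_lines(words, lines):
    """Return the lines that contain at least one of the words as a whole word."""
    wanted = {w.strip() for w in words if w.strip()}
    result = []
    for line in lines:
        tokens = set(re.findall(r"\w+", line))  # split the line into words
        if tokens & wanted:
            result.append(line)
    return result
```

With the sample files above, the function returns the "car" and "notebook" lines and skips the "laptop" line.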

Parse text file into SQL table

How would I transform the following block of text, which is an excerpt from an application log in txt format:
ID: 1
Name: John
ID: 2
Name: Doe
into the following format:
ID Name
1 John
2 Doe
I'd write an AutoHotKey script to read the log file, parse it, and output the new format.
You can use Loop, Parse to read the contents based on the colon (:) delimiter. It's quite neat!
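The same split-on-colon idea can be sketched in Python instead of AutoHotKey. The sketch assumes every record starts with an "ID:" line and that each line holds one "key: value" pair. The function name `parse_records` is hypothetical.

```python
def parse_records(lines):
    """Group 'key: value' lines into one dict per record, starting at each ID."""
    records = []
    current = {}
    for line in lines:
        key, sep, value = line.partition(":")
        if not sep:
            continue  # no colon on this line, so nothing to parse
        key, value = key.strip(), value.strip()
        if key == "ID" and current:
            records.append(current)  # a new ID closes the previous record
            current = {}
        current[key] = value
    if current:
        records.append(current)
    return records
```

Each returned dict corresponds to one table row, so the result can be written out as "ID Name" lines or passed to a parameterized INSERT.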