syslog-ng match and filter is not working the way I want - syslog-ng

I have the following messages:
1)"customer1"," 5","0","".....
2)"customer2"," 5","0",""....
3)"customer3"," 5","0",""...
4)""," 5","0",""
5)""," 5","0",""
What I want to achieve: based on the first value in double quotes, I want to create folders and write each log only into its respective folder, and whenever that first double-quoted value is blank, send those logs to the Others folder. With the following configuration I am able to create folders like customer1, customer2 and customer3. The problem occurs when the first value is blank, as in logs 4 and 5.
syslog-ng.conf
filter c1 {match('(^"")' flags("store-matches") type("pcre") value("MESSAGE") );};
destination d1 {file("/opt/data/cef/other/${DAY}${MONTH}${YEAR}_other.log");};
log {source(s_udp);filter(c1);destination(d1);};
filter c2 {match('(?<=")([\w\s]*)(?=")' flags("store-matches") type("pcre") value("MESSAGE") );};
destination d2 {file("/opt/data/cef/$1/${DAY}${MONTH}${YEAR}_$1.log");};
log {source(s_udp);filter(c2);destination(d2);};
The first filter checks whether the first double-quoted value is empty (just "") and writes those logs into the Others folder. The problem is with the second filter: it matches everything between "". It works fine when there is a value, but misbehaves when the value is empty. In that case it writes the log into a file named 03_06_2017.log directly in the /opt/data/cef folder. I am not sure why it is creating a separate file.
Please help.
Regards
VG

I think it would be easier to use a csv-parser: https://www.balabit.com/documents/syslog-ng-ose-latest-guides/en/syslog-ng-ose-guide-admin/html/csv-parser.html
If the number of columns in the messages varies, and you only need the first column for your filter, then you can use the greedy flag to take care of the other columns.
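An untested sketch of what that could look like (the parser, filter, and column names here are mine, and you may need to adjust the syntax to your syslog-ng version; the second column just soaks up the rest of the message via the greedy flag):

```
parser p_customer {
    csv-parser(
        columns("CUSTOMER", "REST")   # REST takes everything after the first column
        delimiters(",")
        quote-pairs('""')
        flags(greedy)
    );
};

filter f_other { "${CUSTOMER}" eq "" };

destination d_named {
    file("/opt/data/cef/${CUSTOMER}/${DAY}${MONTH}${YEAR}_${CUSTOMER}.log");
};
destination d_other {
    file("/opt/data/cef/other/${DAY}${MONTH}${YEAR}_other.log");
};

log { source(s_udp); parser(p_customer); filter(f_other); destination(d_other); flags(final); };
log { source(s_udp); parser(p_customer); destination(d_named); };
```

With flags(final) on the first log path, blank-customer messages stop there and everything else falls through to the per-customer destination.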

Related

How to get the part of a string after the last occurrence of certain character?

I would like to have the substring after the last occurrence of a certain character.
Now I found here how to get the first, second or so parts, but I need only the last part.
The input data is a list of file directories:
c:\dir\subdir\subdir\file.txt
c:\dir\subdir\subdir\file2.dat
c:\dir\subdir\file3.png
c:\dir\subdir\subdir\subdir\file4.txt
Unfortunately this is the data I have to work with, otherwise I could list it using the command prompt.
The problem is that the number of directories is always changing.
My code based on the previous link is:
select (regexp_split_to_array(BTRIM(path),'\\'))[1] from myschema.mytable
So far I've tried some things in the brackets that came to mind, for example [end], [-1], etc.
None of them work. Is there a way to get the last part without reversing my strings, getting the first part, and then turning it back?
You can use regexp_matches():
select (regexp_matches(path, '[^\\]+$'))[1]
Here is a db<>fiddle.
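The pattern itself is portable; here is a quick illustration of what `[^\\]+$` matches, sketched in Python:

```python
import re

paths = [
    r"c:\dir\subdir\subdir\file.txt",
    r"c:\dir\subdir\file3.png",
]

# [^\\]+$ : one or more characters that are not backslashes,
# anchored to the end of the string -- i.e. the last path component.
for p in paths:
    print(re.search(r"[^\\]+$", p).group())
# → file.txt
# → file3.png
```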

How do I write an SSIS Expression to extract one folder name from a fully qualified file name

I have an SSIS package with a Foreach File Enumerator loop (retrieving the Fully Qualified Name) with an FTP task within it.
The package when executed will go through the files in the subfolders within the following directory
C:\Test\Test2\ABC\
*.txt
And it will post the files to an FTP site.
I have a defined variable called #[User::Filename] within the foreach loop.
But there are folders within the FTP and I want the files to go to based on the Folder they are taken from on the C drive.
C:\Test\Test2\ABC\A\1.txt
C:\Test\Test2\ABC\B\2.txt
C:\Test\Test2\ABC\C\3.txt
File 1.txt should go to the FTP folder Called \FTP\A
File 2.txt should go to the FTP folder Called \FTP\B
File 3.txt should go to the FTP folder Called \FTP\C
My original thought was to make the remote path a variable and piggyback off the foreach loop variable Fully qualified name.
To do this I created a variable called #[User::FilenameFTP] and inputted the following into the expression
"//FTP//" +
RIGHT(
(LEFT(#[User::Filename], ABS((FINDSTRING(#[User::Filename], "//", 5)))),
ABS((FINDSTRING(#[User::Filename], "//", 5)-1)) - ABS((FINDSTRING(#[User::Filename], "//",4)+1))
)
I thought this formula would give me the filename in the C drive which the file is coming from and I used this as the Remote Path variable within the FTP task. But when I run it the files still go into \FTP\ and not into the subfolders.
I ran a script task on this and the output isn't showing what I want either. What am I doing wrong? Can this not be done this way, editing the variable within the foreach loop?
If your drive names are coming in (more or less) as you have them shown, then those should be backslashes ("\\") instead of forward slashes in your expression. Might not be the issue, but I changed them to play around with this.
Using the C folder string, in the expression as written, ABS((FINDSTRING(#[User::Filename], "\\", 5)-1)) and ABS((FINDSTRING(#[User::Filename], "\\",4)+1)) both evaluate to 19, so the expression comes down to RIGHT(<<String>>,0), and, from the documentation: "If integer_expression is zero, the function returns a zero-length string." So you're not appending anything to the end of the FTP base folder name.
Down and Dirty Fix
We could probably mess around with all that LEFT and RIGHT and FINDSTRING, but if you know that the folder name you're after will always be the fifth element in your fully qualified name (which your expression is already dependent on) you can get there faster just using TOKEN, and specifying the fifth element of your slash-delimited string:
"//FTP//" + TOKEN( #[User::Filename],"\\",5) +"//"
Which evaluates to //FTP//C//.
More Sustainable Fix
On the other hand, if you want to future-proof your code a little, in anticipation of the day that you add or eliminate a level of folder hierarchy, I would suggest extracting the last folder name, without regard to how many levels of folder come first.
We can do that using SUBSTRING and some clever REVERSE work, with due credit to KeithL for this answer, that got me rolling.
SUBSTRING takes three arguments. We have our string #[User::Filename], so that's one. The second is the starting position from the left end of the string, and the third is the number of characters to extract.
To get the starting position, we'll find the position of the second to last slash using REVERSE to count characters from the right hand end of the string:
FINDSTRING(REVERSE( #[User::Filename]),"\\",2) (evaluates to 8 here)
So our starting position is the total length of the string, minus the number of characters back to the second-to-last slash, plus two (one to step over the slash itself, and one because SUBSTRING positions are 1-based).
LEN( #[User::Filename]) - FINDSTRING(REVERSE( #[User::Filename]),"\\",2) + 2 (=19)
We can get the number of characters to pull by subtracting the reversed position of the last slash from the reversed position of the second to last slash, then subtracting one more, since we don't want that trailing slash in our string yet.
FINDSTRING(REVERSE( #[User::Filename]),"\\",2)
- FINDSTRING(REVERSE( #[User::Filename]),"\\",1) - 1 (= 1 in our example)
And there are our three arguments. Putting those all together with your base folder name (and I added a trailing slash. If that doesn't work for you, take it out of there!):
"//FTP//"
+ SUBSTRING(
#[User::Filename] ,
LEN( #[User::Filename]) - FINDSTRING(REVERSE( #[User::Filename]),"\\",2),
FINDSTRING(REVERSE( #[User::Filename]),"\\",2)
-FINDSTRING(REVERSE( #[User::Filename]),"\\",1)-1 )
+ "//"
Evaluates to //FTP//C//.
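If it helps to sanity-check the arithmetic, here is the same last-folder extraction sketched in Python (illustrative only; the SSIS expression is what actually runs in the package):

```python
# Mirror of the SSIS logic: find the last two backslashes
# and slice out the folder name that sits between them.
filename = r"C:\Test\Test2\ABC\C\3.txt"

last = filename.rfind("\\")                  # position of the last slash
second_last = filename.rfind("\\", 0, last)  # position of the one before it
folder = filename[second_last + 1 : last]

print("//FTP//" + folder + "//")   # → //FTP//C//
```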
Now, when the powers that be decide to "clean up" that source server, and the Test2 layer disappears all of a sudden, or things get crazy, and you bury this all one layer deeper, your code will still work.
Side Note
If you're really using drive letters in your file path names, like C:\, be aware that when you're running your packages locally, that's your C:\ drive the package is using, but when you deploy this, it'll be looking around on the server's C:\ drive, and it just might not like what it finds, or more likely, doesn't find there.

Targeting a string for deletion with grep, sed, awk (or cut)

I am trying to parse some logs to extract the user agent and account id from each line. I have already managed to pull the user agent and a string which contains the account id all on the same line.
The next step is to extract the account id from its longer string. I thought this would be fairly simple, as I know the start of the string and there are / slashes for the delimiter, but the user agent also contains slashes and has a varied number of fields.
The log file currently looks something like the following example but there are hundreds to thousands of lines to parse. Luckily I am working off a partition with plenty of space to spare.
USER_AGENT_PART ACCOUNT_ID_Part_/plus/path/to/stuff/they/access
some user agent/1.3 KnownString1_32d4-56e-009f98/some/stuff/here
user/agent KnownString1_12d3-345e-4c534/more/stuff/here
User/Agent cURL/1.5.0 KnownString2_12d34e56/stuff/things/stuff/stuff
one/User Agent/2.0 KnownString1_12d3_456e_7g8/more/random/stuff/stuff
So the goal is to keep the user agent part and the account id part and drop the path of the stuff they are accessing in the last string. But I can't use / or spaces as general delimiters because many user agents have / and various amounts of spaces in their name.
Also, there are many more distinct user agent types than this little sample shows: anywhere from 25 to 50 depending on the log. So it doesn't seem worth it to target the user agent and try to exclude it.
It seems the logical way to start is by targeting the part of the account ID which is a known string (KnownString1 or KnownString2) and grab everything from there (which is unknown numbers and letters with dashes) up until the first / of that account string.
Then I would delete the first / (In the account ID string) and everything after. I expect I will need to do this in two passes to utilize the two known parts of the user IDs.
This seemed like it would be easy but I just can't wrap my head around how to start targeting that last string. I don't even have a good example of something that is close to working because I don't know how to target the last string by delimiters without catching the same delimiters in the user agent part.
Any ideas?
Edit: Every line will have an account id that starts with one of two common KnownString_ in it but then is followed by a series of unknown digits and dashes until it gets to the first /. So I don't need to search for lines containing that before targeting the string.
Edit2: My original examples of the Account ID did not reflect there were letters mixed in with the numbers.
Edit3: Thanks to the responses from oguz ismail and kesubagu I was able to solve this using egrep. Looks like I was trying to make things more complicated than they were. I also realized I need to revisit grep, as it's capable of doing far more than what I tend to use it for.
This is what I ended up using which worked in one pass:
egrep -o ".+(KnownString1|KnownString2)_[^/]+" logfile > logfile2
Using grep:
$ grep -o '.*KnownString[^/]*' file
some user agent/1.3 KnownString1_32d4-56e-009f98
user/agent KnownString1_12d3-345e-4c534
User/Agent cURL/1.5.0 KnownString2_12d34e56
one/User Agent/2.0 KnownString1_12d3_456e_7g8
.* matches everything before KnownString, and [^/]* matches everything after KnownString until the first /.
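If you want to sanity-check the pattern outside grep, the same idea works in any regex engine; sketched in Python:

```python
import re

lines = [
    "some user agent/1.3 KnownString1_32d4-56e-009f98/some/stuff/here",
    "User/Agent cURL/1.5.0 KnownString2_12d34e56/stuff/things/stuff/stuff",
]

# .* is greedy, so it swallows any slashes inside the user agent;
# [^/]* then stops the match at the first slash after the account id.
for line in lines:
    print(re.match(r".*KnownString[^/]*", line).group())
# → some user agent/1.3 KnownString1_32d4-56e-009f98
# → User/Agent cURL/1.5.0 KnownString2_12d34e56
```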
You can use egrep with the -o option, which will output only the part of the line that matches the provided regex, so you could do something like this
cat test | egrep -o ".+(KnownString1|KnownString2)_[_0-9-]+"
where the test file contains the input you've given, the output in this case was
some user agent/1.3 KnownString1_324-56-00998
user/agent KnownString1_123-345-4534
User/Agent cURL/1.5.0 KnownString2_123456
one/User Agent/2.0 KnownString1_123_456_78

How to use FILE_MASK parameter in FM EPS2_GET_DIRECTORY_LISTING

I am trying to filter files using FILE_MASK parameter in EPS2_GET_DIRECTORY_LISTING to reduce time searching all files in the folder (has thousands of files).
File mask I tried:
TK5_*20150811*
file name in the folder is;
TK5_Invoic_828243P_20150811111946364.xml.asc
But it exports all files to the DIR_LIST table, so nothing is filtered.
But when I try with;
TK5_Invoic*20150811*
It works!
What I think is that it works if I give the first 10 characters as-is. But in my case I do not always have the first 10 characters.
Can you give me an advice on using FILE_MASK?
Haven’t tried, but this sounds plausible:
https://archive.sap.com/discussions/thread/3470593
The * wildcard may only be used at the end of the search string.
It is not specified what a '*' matches when it is not the last non-space character in the FILE parameter value.
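If the wildcard really only works at the end, one workaround is to pass a prefix-only mask (or no mask at all) and filter the returned DIR_LIST on your side; in ABAP that would be a loop with CP matching. The idea, sketched in Python with fnmatch (file names are from the question; everything else here is illustrative):

```python
from fnmatch import fnmatch

# Hypothetical listing as returned in DIR_LIST
dir_list = [
    "TK5_Invoic_828243P_20150811111946364.xml.asc",
    "TK5_Invoic_828244P_20150812093011222.xml.asc",
    "OTHER_file_20150811.txt",
]

# Apply the full pattern client-side, where * can appear anywhere
wanted = [f for f in dir_list if fnmatch(f, "TK5_*20150811*")]
print(wanted)   # → ['TK5_Invoic_828243P_20150811111946364.xml.asc']
```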

Limiting character input to specific characters

I'm making a fully working add and subtract program as a nice little easy project. One thing I would love to know is if there is a way to restrict input to certain characters (such as 1 and 0 for the binary inputs and A and B for the add or subtract inputs). I could always replace all characters that aren't these with empty strings to get rid of them, but doing something like this is quite tedious.
Here is some simple code to filter out the specified characters from a user's input:
local filter = "10abAB"
local input = io.read()
input = input:gsub("[^" .. filter .. "]", "")
The filter variable is just set to whatever characters you want to allow in the user's input. As an example, if you also want to allow c and C, add them: local filter = "10abcABC".
Although I assume that you get input from io.read(), it is possible that you get it from somewhere else, so you can just replace io.read() with whatever you need there.
The third line of code in my example is what actually filters out the text. It uses string:gsub to do this, meaning that it could also be written like this:
input = string.gsub(input, "[^" .. filter .. "]", "").
The benefit of writing it like this is that it's clear that input is meant to be a string.
The gsub pattern is [^10abAB], which means that any characters not in that set will be filtered out, due to the ^ at the start of the set; they are replaced with the empty string given as the last argument in the method call.
Bonus super-short one-liner that you probably shouldn't use:
local input = io.read():gsub("[^10abAB]", "")