Ingest an entire directory's files and their attributes into an Oracle table (SQL)

We have a large number of batch jobs that generate ASCII log files. We browse these logs with tools such as grep or the vi/vim editors.
Simple text searches are fine. But if we have to search these log files for a particular string and compare the date/times at which the string was generated, the task becomes unwieldy: it means finding the file that contains the string and manually noting down the file's modification date/time.
Are there any tools that can ingest an entire directory structure on Linux and store it in an Oracle table with the following attributes:
1. Full name of directory
2. Filename
3. File Modification Date/time
4. Line number
5. Line Text
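
For illustration, a minimal sketch of what the target table (and the kind of query it would enable) might look like; the table and column names are hypothetical:

-- Hypothetical target table matching the five attributes above
CREATE TABLE log_lines (
    dir_name    VARCHAR2(4000),   -- 1. full name of directory
    file_name   VARCHAR2(255),    -- 2. filename
    file_mtime  TIMESTAMP,        -- 3. file modification date/time
    line_no     NUMBER,           -- 4. line number
    line_text   VARCHAR2(4000)    -- 5. line text
);

-- Once loaded, the "find the string and compare times" task becomes a query
SELECT dir_name, file_name, file_mtime, line_no, line_text
FROM   log_lines
WHERE  line_text LIKE '%some search string%'
ORDER  BY file_mtime, line_no;

Populating such a table could then be scripted, e.g. with a shell script that walks the directories and feeds SQL*Loader or an external table, though that part is not shown here.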


How to extract SQL code from KNIME nodes?

Is there a way to automatically extract code from nodes and save it in .sql or .txt files?
I'm mostly using Database SQL Executor (legacy) nodes that contain SQL queries.
I've found that there is a settings.xml file for every node, in which I can see the code as the value for key="statement"; maybe I could use the XML Reader and XPath nodes somehow?
I would like to have a .sql or .txt file for every node; that file should contain the SQL code pasted into that particular node. It would be great if I could use the node's name as the file name.
The Vernalis community contribution has a DB to Variable node. If you change the input port type to DB Data Port, then one of the output variables will be the SQL query. If you are using the legacy nodes, then the corresponding Database To Variable (Legacy) node will do the same thing.
Once you have the SQL in a variable, you can use a Variable to Table Row node and then, e.g., the Vernalis Save File Locally node; or, if you require further options, the String to Binary Objects and Binary Objects to Files nodes will allow that.
I've decided to share the idea of my solution to this problem; maybe someone else would like to do something similar.
I had to work in RStudio and decided to write a script in Rcpp (a variant of C++ that lets you embed R code in it).
The script takes a path to the KNIME workflow and iterates through every node folder in search of "Database SQL Executor" and "Database Reader" nodes.
It then extracts the SQL code and the node name from the settings.xml file.
After saving them to variables, it strips from the node name any characters not allowed in Windows file names (like ? : | \ / etc.) and anything the XML encoding added.
The same goes for the SQL code, except that instead of removing the XML artifacts it converts them back to the normal characters (for example, it changes %%000010 to \n, or &lt; to <).
Once the SQL code is cleaned up and formatted, it saves the code in a .sql file whose name is the name of the node.
It works pretty well and is quite fast. One annoying problem is that Rcpp doesn't read UTF-8 characters, so I had to strip them out of the node names manually so the names are readable and not full of nonsense.

How to read a tab-delimited .txt file and insert it into an Oracle table

I want to read a tab-delimited file using PL/SQL and insert the file data into a table.
A new file will be generated every day.
I am not sure whether an external table will help here, because the filename changes based on the date.
Filename: SPRReadResponse_YYYYMMDD.txt
Below is the sample file data.
An option that works from your own PC is SQL*Loader. As the file name changes every day, you'd use your operating system's scripting (on MS Windows, .BAT batch files) to pass a different name when calling sqlldr (along with the control file).
An external table requires access to the database server and (at least) the read privilege on the directory object that contains those .TXT files. Unless you're a DBA, you'll have to ask one to set up that environment. As for the changing file name, you could use ALTER TABLE ... LOCATION, which is rather inconvenient.
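For illustration, assuming a directory object named EXT_DIR and made-up column names (the real column list depends on the file layout), that combination would look roughly like this:

-- Hypothetical external table over the tab-delimited files in EXT_DIR
CREATE TABLE spr_read_response_ext (
    col1 VARCHAR2(100),
    col2 VARCHAR2(100),
    col3 VARCHAR2(100)
)
ORGANIZATION EXTERNAL (
    TYPE ORACLE_LOADER
    DEFAULT DIRECTORY ext_dir
    ACCESS PARAMETERS (
        RECORDS DELIMITED BY NEWLINE
        FIELDS TERMINATED BY X'09'        -- tab
        MISSING FIELD VALUES ARE NULL
    )
    LOCATION ('SPRReadResponse_20230101.txt')
)
REJECT LIMIT UNLIMITED;

-- The inconvenient part: repointing the table at each day's file
ALTER TABLE spr_read_response_ext LOCATION ('SPRReadResponse_20230102.txt');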
If you want full control, use UTL_FILE; yes, you still need access to that directory on the database server, but since you're writing PL/SQL you can change whatever you want, including the file name.
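A minimal sketch of that approach, again assuming the EXT_DIR directory object, a hypothetical target table SPR_READ_RESPONSE with two columns, and tab (CHR(9)) as the separator:

-- UTL_FILE sketch; directory object, target table and file layout are assumptions
DECLARE
    l_file  UTL_FILE.FILE_TYPE;
    l_name  VARCHAR2(128) := 'SPRReadResponse_' || TO_CHAR(SYSDATE, 'YYYYMMDD') || '.txt';
    l_line  VARCHAR2(32767);
    l_tab   CONSTANT VARCHAR2(1) := CHR(9);
BEGIN
    l_file := UTL_FILE.FOPEN('EXT_DIR', l_name, 'r', 32767);
    LOOP
        BEGIN
            UTL_FILE.GET_LINE(l_file, l_line);
        EXCEPTION
            WHEN NO_DATA_FOUND THEN EXIT;          -- end of file
        END;
        -- split the tab-delimited line (this simple pattern skips empty fields)
        INSERT INTO spr_read_response (col1, col2)
        VALUES (REGEXP_SUBSTR(l_line, '[^' || l_tab || ']+', 1, 1),
                REGEXP_SUBSTR(l_line, '[^' || l_tab || ']+', 1, 2));
    END LOOP;
    UTL_FILE.FCLOSE(l_file);
    COMMIT;
END;
/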
Or, as a simpler option, first rename the input file to SPRReadResponse.txt, then load it and save yourself all that trouble.

Query for finding all occurrences of a string in a database

I'm trying to find a specific string in my database. I'm currently using FlameRobin to open the FDB file, but this software doesn't seem to have a proper feature for this task.
I tried the following SQL query, but it didn't work:
SELECT
*
FROM
*
WHERE
* LIKE '126278'
So, what is the best way to do this? Thanks in advance.
You can't do such a thing directly. But you can convert your FDB file to text files such as CSV, so that you can search for your string in all the tables/files at the same time.
1. Download a database converter
First, you need software to convert your database file. I recommend using Full Convert; just get the free trial and download it. It is really easy to use, and it will export each table to a separate CSV file.
2. Find your string in multiple files at the same time
For that task you can use the Find in Files feature of Notepad++ to search for the string in all the CSV files located in the same folder.
3. Open the desired table on FlameRobin
When Notepad++ highlights the string, it shows which file it is in and the line number. Full Convert saves each CSV with the same name as the original table, so you can find the table easily in whatever database manager you are using.
Here is the Firebird documentation: https://www.firebirdsql.org/file/documentation/reference_manuals/fblangref25-en/html/fblangref25.html
You need to read about:
1. stored procedures of the "selectable" kind,
2. the EXECUTE STATEMENT command, including its FOR EXECUTE STATEMENT variant,
3. the system tables that have "RELATION" in their names.
Then, in your SP, you enumerate all the tables, then enumerate all the columns in those tables, and for every one of them you run a usual
select 'tablename', 'columnname', columnname
from tablename
where columnname containing '12345'
over every field of every table.
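Putting those pieces together, a rough sketch of such a selectable procedure could look like this (Firebird 2.5 syntax; the procedure name is made up, and columns the predicate cannot be applied to are simply skipped):

SET TERM ^ ;

-- Sketch: enumerate user tables/columns from the RDB$ system tables and probe
-- each column with EXECUTE STATEMENT ... CONTAINING
CREATE PROCEDURE SEARCH_ALL_COLUMNS (SEARCH_FOR VARCHAR(100))
RETURNS (TABLE_NAME VARCHAR(63), COLUMN_NAME VARCHAR(63), HITS INTEGER)
AS
BEGIN
  FOR SELECT TRIM(rf.RDB$RELATION_NAME), TRIM(rf.RDB$FIELD_NAME)
      FROM RDB$RELATION_FIELDS rf
      JOIN RDB$RELATIONS r ON r.RDB$RELATION_NAME = rf.RDB$RELATION_NAME
      WHERE COALESCE(r.RDB$SYSTEM_FLAG, 0) = 0     -- user tables only
        AND r.RDB$VIEW_BLR IS NULL                 -- skip views
      INTO :TABLE_NAME, :COLUMN_NAME
  DO
  BEGIN
    EXECUTE STATEMENT
      ('SELECT COUNT(*) FROM "' || TABLE_NAME || '" WHERE "' || COLUMN_NAME || '" CONTAINING :needle')
      (needle := :SEARCH_FOR)
      INTO :HITS;
    IF (HITS > 0) THEN
      SUSPEND;
    WHEN ANY DO
      HITS = 0;                                    -- column not searchable; ignore it
  END
END^

SET TERM ; ^

Being selectable, it is then queried like a table: SELECT * FROM SEARCH_ALL_COLUMNS('126278');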
But practically speaking, it would most probably be better to avoid SQL commands altogether: just extract the WHOLE database into a long SQL script, open that script in Notepad (or any other text editor), and search there for the string you need.

Azure PowerShell command to get the count of records in an Azure Data Lake file

I have a set of files in an Azure Data Lake Store folder. Is there any simple PowerShell command to get the count of records in a file? I would like to do this without using the Get-AzureRmDataLakeStoreItemContent command on the file item, as the files are gigabytes in size. Using this command on big files gives the error below.
Error:
Get-AzureRmDataLakeStoreItemContent : The remaining data to preview is greater than 1048576 bytes. Please specify a
length or use the Force parameter to preview the entire file. The length of the file that would have been previewed:
749319688
Azure Data Lake operates at the file/folder level. The concept of a record really depends on how an application interprets it: in one case a file may contain CSV lines, in another a set of JSON objects, and in some cases files contain binary data. Therefore, there is no way at the file system level to get the count of records.
The best way to get this information is to submit a job, such as a U-SQL job in Azure Data Lake Analytics. The script is really simple: an EXTRACT statement followed by a COUNT aggregation and an OUTPUT statement.
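For example, a minimal U-SQL sketch (the paths are placeholders; Extractors.Text is used with a column delimiter that should not occur in the data, so every line arrives as a single string column):

// Count the lines/records of one file; paths are hypothetical
@lines =
    EXTRACT line string
    FROM "/input/somefolder/somefile.txt"
    USING Extractors.Text(delimiter: '\u0001', quoting: false);

@count =
    SELECT COUNT(*) AS RecordCount
    FROM @lines;

OUTPUT @count
TO "/output/record_count.csv"
USING Outputters.Csv();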
If you prefer Spark or Hadoop, here is a StackOverflow question that discusses that: Finding total number of lines in hdfs distributed file using command line

Kettle - Read multiple files from a folder

I'm trying to read multiple XML files from a folder, compile all the data they contain (all of them have the same XML structure), and then save that data in a CSV file.
I already have a 'read-files' Transformation with the steps Get File Names and Copy Rows to Result to get all the XML files. (It's working: I print a file with all the file names.)
Then I enter a 'for-each-file' Job, which has a Transformation with the Get Rows from Result step, and then another Job to process those files.
I think I'm losing information between the 'read-files' Transformation and the Transformation in the 'for-each-file' Job that gets all the rows. (I print another file with all the file names, but it is empty.)
Can you tell me if I'm thinking about this the right way? Do I have to set some variables, or is there some option that is disabled? Thanks.
Here is an example of "How to process a Kettle transformation once per filename"
http://www.timbert.net/doku.php?id=techie:kettle:jobs:processtransonceperfile