SQL Stored procedure extract data - sql

Is this possible in an Oracle 11g SQL stored procedure? I need to read the IDs listed in list.txt (1, 3, 5), use them to query the Customer table, and export the result to result.txt.
Table name: Customer
ID|Name |Country
1 |Mark |USA
2 |Allan|UK
3 |James|USA
4 |Todd |UK
5 |Mike |UK
File: list.txt
ID
1
3
5
Result exported in text file
result.txt
1 |Mark |USA
3 |James|USA
5 |Mike |UK

This link should help you read data from the pipe-separated file. The example shows how to read from a CSV file; you only need to replace the , (comma) character in the example with a | (pipe) character.
The link also contains a reference that shows how to create CSV files using PL/SQL. Again, all you need to do is replace the , (comma) character in the example with a | (pipe) character.
Also, you need to ensure that you have permissions to read from and write to the folder locations that contain the text files you want to process.
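The read, filter, and export steps can also be sketched outside the database. The following is a minimal Python illustration, not the PL/SQL from the linked example: it uses an in-memory SQLite table as a stand-in for the Oracle Customer table, and the function names (parse_ids, fetch_customers, to_pipe_text, export) are hypothetical. In a stored procedure, the file access would typically go through UTL_FILE instead.

```python
import sqlite3


def parse_ids(text):
    """Return the IDs from list.txt content, skipping the 'ID' header line."""
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    return [int(line) for line in lines[1:]]


def fetch_customers(conn, ids):
    """Select the customer rows whose id is in ids, ordered by id."""
    if not ids:
        return []
    placeholders = ",".join("?" for _ in ids)
    query = (
        "SELECT id, name, country FROM customer "
        f"WHERE id IN ({placeholders}) ORDER BY id"
    )
    return conn.execute(query, ids).fetchall()


def to_pipe_text(rows):
    """Format rows as pipe-separated lines, one row per line."""
    return "".join("|".join(str(v) for v in row) + "\n" for row in rows)


def export(conn, list_path, result_path):
    """Read IDs from list_path and write the matching rows to result_path."""
    with open(list_path) as src:
        ids = parse_ids(src.read())
    with open(result_path, "w") as out:
        out.write(to_pipe_text(fetch_customers(conn, ids)))
```

Calling export(conn, "list.txt", "result.txt") on a connection that holds the sample table would write the three matching rows, without the column padding shown in the question.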

Related

How do I split Airbnb amenities data into multiple rows in Tableau Prep?

I need to split the following type of data into separate rows in Tableau Prep Builder. There are only 2 columns, Listing ID and Amenities, like so:
Listing ID | Amenities
------------------------------------------------
123 | ["Oven", "Blankets", "WiFi", "Dryer", etc... ]
I would like to split them up like so:
Listing ID | Amenity
_________________________
123 | Oven
123 | Blankets
123 | WiFi
123 | Dryer
etc....
Is this possible?
I can split the amenities data into columns, but that is not what I need.
If this is not possible in Prep Builder, I am OK with importing the data into SQL Server and using a SQL query, if that's possible.
For those who come to this later looking for the same thing, this is how I achieved what I needed:
1. Remove the leading and trailing brackets from the string using the RIGHT and LEFT functions.
2. Remove all quotes using REPLACE.
3. For the rest, refer to this link: Turning a Comma Separated string into individual rows
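The three steps above can be sketched in Python to show the intended transformation. This is only an illustration of the logic, not Tableau Prep or SQL Server syntax; the function name split_amenities is hypothetical, and the sketch assumes that no amenity name contains a comma or a double quote.

```python
def split_amenities(listing_id, amenities):
    """Turn '["Oven", "WiFi"]' into one (listing_id, amenity) row per item."""
    inner = amenities.strip()
    # Step 1: drop the leading '[' and the trailing ']'.
    if inner.startswith("[") and inner.endswith("]"):
        inner = inner[1:-1]
    # Step 2: remove all double quotes.
    inner = inner.replace('"', "")
    # Step 3: split on commas and emit one row per non-empty amenity.
    return [(listing_id, item.strip()) for item in inner.split(",") if item.strip()]
```

Because the Amenities value looks like a JSON array, a JSON parser would be a more robust choice if names can contain commas.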

Transpose/Pivot Excel file in Pentaho (using multiple files)

I've been having some trouble with the following situation: There's an Excel file I need to use which has the information in the following format:
ColumnA | ColumnB
Name | John
Business | Pentaho
Address | Evergreen 123
Job type | Food processing
NameBoss | Boss lv1
Phone | 555-NoPhone
Mail | thisATmail
What I need to do is turn each value in column A into its own column, ending up with 7 columns, each holding one value: the corresponding data from column B. Additionally, the integration reads the filename as an extra output field:
SELECT
'${FILES_ROOT}/proyectos/BUSINESS_NAME/B_NAME_OPER/archivos_fuente/NÓMINA BAC - ' ||nombre_empresa||'.xlsx' as nombre_archivo
--, nombre_empresa
FROM "public".maestro_empresa
This is how I have set up the transformation for the Excel file:
As can be seen, in the Fields tab of the transformation I added each column manually, since the data in the Excel file does not have headers.
With this done, I am not sure how to proceed from here to get the transposed data I need. What can I do?
The end result I am looking for is something like this:
Name | Business | Address | Job type | NameBoss | Phone | Mail | excel_name
John | Pentaho | Evergreen 123 | Food processing | Boss lv1 | 555-NoPhone | thisATmail | ExcelName.xlsx
You can do this easily with the 'Row denormaliser' step. First read the input from the Excel file, then pass it to the 'Row denormaliser' step. You can see a sample HERE.
Note: Remove the ''Id'' column from my sample if you always expect to get a single line.
If your ColumnA values are dynamic (not fixed), you can use THIS Metadata Injection sample, where you need to read the same Excel input twice but do not need to specify the column names. Run the transformation "MetaDataInjectionPV.ktr".
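To make the expected result of the denormalisation concrete, here is a minimal Python sketch of the same idea: key/value rows from one file are collapsed into a single record, with the file name appended as an extra field. It does not use Pentaho; the function name denormalise and the excel_name key are taken from or modelled on the question and are otherwise hypothetical.

```python
def denormalise(rows, file_name):
    """Collapse (key, value) rows from one file into a single record."""
    record = {}
    for key, value in rows:
        # Each ColumnA value becomes a column; ColumnB supplies its value.
        record[key] = value
    # Carry the source file name along as an extra output field.
    record["excel_name"] = file_name
    return record
```

Since the keys are taken from the data itself, this corresponds to the dynamic case; a fixed list of expected keys would correspond to configuring the target fields by hand.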

Transpose variable number of rows into columns in OpenRefine

I have an xml file containing records from a library catalogue. I have imported it into OpenRefine but all the values are in one column. I want to transpose it so each field in the record has its own column. However, this is complicated by the fact that a) each field is optional so does not exist in all records and b) many fields are repeatable so can appear multiple times in each record. Here's a simplified example of what the data looks like:
| RecordID | Tag | Data |
| 1 | 040a | CaABCD |
| 1 | 245a | Go fish |
| 1 | 245a | A guide to fish |
| 1 | 246i | Fish series |
| 1 | 260a | Fishing friends |
| 2 | 040a | CaABDC |
| 2 | 245a | Happy trails |
| 2 | 246i | Hiking series |
| 2 | 260i | The happy hiker |
| 2 | 500a | Notes |
I have read the Q&A here Openrefine - Transpose rows into columns based on text but the problem with this solution is that if I concatenate all the values together I have no way to be sure what field they belong in anymore, as my data is much more complicated than the data in that question (my actual data has 25+ fields and many thousands of records).
I was able to get closer using Google Sheets and making a pivot table with a calculated field (as in PivotTable to show values, not sum of values - see the answer at the very bottom). However, I still don't know how to handle the repeating fields. In the pivot table the multiple values are there but only the first displays (double-clicking on an individual cell brings up a details table which lists all the values), so when I copy-paste the table I lose the additional values. I would like to concatenate them but I cannot see a way to do so within the pivot table.
Can you think of any other way I could do this, in OpenRefine or another tool? Thanks!
The classic way to fix this in OpenRefine is to use "Transpose -> Columnize by key/value columns". But this feature is poorly documented and can cause headaches even for OpenRefine developers. In your case, repeated fields will be problematic, so here is a possible solution.
1° Go to the "Tag" column, click on "Transpose -> Columnize by key/value columns" and use the following configuration (don't forget the "Note column (optional)")
The result will look like this (my dataset is not exactly the same as yours; I modified a value for testing)
2° In the new column "Record ID: 040 a", click on "Edit column -> Move column to beginning".
3° If you want to merge the repeated fields, go to each column that contains them and click on "Edit cells -> Join multi-valued cells", choosing a separator, for example "|".
The end result will look like this.
To get rid of unnecessary columns: click on "Export -> Custom tabular exporter" and deselect the columns whose names start with RecordID.
OpenRefine also has a native MARC importer, which might be worth trying if you need to work with MARC data in the future. MarcEdit also has some specific OpenRefine support built in.
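For readers who want to check the target shape outside OpenRefine, the columnize-and-join idea can be sketched in Python. This is an illustration under the assumption that the input is a list of (RecordID, Tag, Data) rows as in the question; the function name columnize is hypothetical. Optional fields are simply absent from a record, and repeated fields are joined with a separator.

```python
def columnize(rows, separator="|"):
    """Group (record_id, tag, data) rows into one dict of fields per record."""
    records = {}
    for record_id, tag, data in rows:
        fields = records.setdefault(record_id, {})
        # Repeated tags accumulate in a list so that no value is lost.
        fields.setdefault(tag, []).append(data)
    # Join repeated values so that each tag fits in a single cell.
    return {
        record_id: {tag: separator.join(values) for tag, values in fields.items()}
        for record_id, fields in records.items()
    }
```

The separator should be a character that does not occur in the data, otherwise the joined values cannot be split apart again.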

Export varbinary to file (image) from multiple rows

I have an MSSQL 2008 database containing a table in the format below.
Employee Number | Segment | Data (varbinary(8000))
----------------------------------------------------------
111111 | 1 | 0x01234567...DEF
111111 | 2 | 0x01234567...DEF
111111 | 3 | 0x01234567...DEF
The Data (varbinary) column makes up a picture, but unfortunately it is split into multiple segments by a process I cannot control.
Is there a way to export this data to a file via an SQL script or procedure? I have seen some questions that answer this for a varbinary(max) column, but I can't for the life of me work out how to stitch these segments together into one file.
Note: Some of the files have more than 500 segments, but this procedure will not be run very often.
If the picture can be reconstructed by simply concatenating all of the segments, then you could try execsql.py, which is a SQL script processor written in Python (by me). It has a metacommand of this form:
EXPORT <table_or_view> TO <filename> AS RAW
which will concatenate all columns and rows in the given table or view.
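If the rows have already been fetched into a client program, the same concatenation can be sketched in a few lines of Python. This is not how execsql.py is implemented; it is a minimal illustration that assumes the picture is recovered by joining the segments in ascending Segment order. The function names stitch_segments and write_image are hypothetical.

```python
def stitch_segments(rows):
    """Join (segment_number, data_bytes) rows into one byte string."""
    # Sort by segment number so the pieces are joined in the right order.
    ordered = sorted(rows, key=lambda row: row[0])
    return b"".join(data for _, data in ordered)


def write_image(rows, path):
    """Write the stitched segments for one employee to a binary file."""
    with open(path, "wb") as out:
        out.write(stitch_segments(rows))
```

The rows for a single employee would come from a query filtered on the employee number; running it once per employee yields one file each.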

Flat File Import: Remove Data

(Posted a similar question earlier but HR department changed conditions today)
Our HR department has an automated export from our SAP system in the form of a flat file. The information in the flat file looks like so.
G/L Account 4544000 Recruiting/Job Search
Company Code 0020
--------------------------
| Posting Date| LC amnt|
|------------------------|
| 01/01/2013 | 406.25 |
| 02/01/2013 | 283.33 |
| 03/21/2013 |1,517.18 |
--------------------------
G/L Account 4544000 Recruiting/Job Search
Company Code 0020
--------------------------
| Posting Date| LC amnt|
|------------------------|
| 05/01/2013 | 406.25 |
| 06/01/2013 | 283.33 |
| 07/21/2013 |1,517.18 |
--------------------------
When I look at the data in the SSIS Flat File Source Connection all of the information is in a single column. I have tried to use the Delimiter set to Pipe but it will not separate the data, I assume due to the nonessential information at the top and middle of the file.
I need to remove the data at the top and middle and then have the Date and Total split into two separate columns.
The goal of this is to separate the data so that I can get a single SUM for the running year.
Year Total
2013 $5123.25
I have tried to do this in SSIS but I can't seem to separate the columns or remove the data. I want to avoid a script task as I am not familiar with the code or operation of that component.
Any assistance would be appreciated.
I would create a temp table that can hold the whole flat file, one line per row, and then filter at the SQL level.
An example:
CREATE TABLE tmp (txtline VARCHAR(MAX));
Load the file into the tmp table with BCP or SSIS.
Run a query like this to get the result (you may need to adjust the SUBSTRING positions and lengths to fit your flat file):
WITH cte AS (
SELECT
CAST(SUBSTRING(txtline,2,10) AS DATE) AS PostingDate,
CAST(REPLACE(REPLACE(SUBSTRING(txtline,15,100),'|',''),',','') AS NUMERIC(19,4)) AS LCAmount
FROM tmp
WHERE ISDATE(SUBSTRING(txtline,2,10)) = 1
)
SELECT
YEAR(PostingDate) AS [Year],
SUM(LCAmount) AS Total
FROM cte
GROUP BY YEAR(PostingDate)
Maybe you could open the flat file in MS Excel, using the pipe character as the delimiter, and then create a CSV from that, if needed.
Short of a script task/component (or a full-blown custom SSIS component), I don't think you'll be able to parse that specific format in SSIS. The Flat File Connection Manager does allow you to select how many rows of your text file are headers to be skipped, but the format you're showing has multiple sections (and thus multiple headers). There's also the issue of the horizontal lines, which the Flat File Connection won't be able to properly handle.
I'd first see if there's any way to get a normal CSV file with this data out of SAP. If that turns out to be impossible, then you'll need some sort of custom code to strip out the excess text.
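As one possible shape for such custom code, here is a minimal Python sketch that skips the header and separator lines and sums the amounts per year. It assumes the dates are in MM/DD/YYYY form and the amounts use a comma as the thousands separator, as in the sample; the function name sum_by_year is hypothetical.

```python
from collections import defaultdict
from datetime import datetime
from decimal import Decimal


def sum_by_year(lines):
    """Sum the LC amounts per posting year, ignoring non-data lines."""
    totals = defaultdict(Decimal)
    for line in lines:
        # Drop the outer pipes, then split into the two expected cells.
        parts = [part.strip() for part in line.strip().strip("|").split("|")]
        if len(parts) != 2:
            continue  # account header, company code, or horizontal rule
        try:
            posted = datetime.strptime(parts[0], "%m/%d/%Y")
        except ValueError:
            continue  # the "Posting Date | LC amnt" column header row
        # Remove thousands separators before converting the amount.
        totals[posted.year] += Decimal(parts[1].replace(",", ""))
    return dict(totals)
```

Decimal is used instead of float so that the currency amounts add up exactly.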