How to pivot single-column source data in SQL?

Below are the input and output details. Answers for any database (Oracle, SQL Server, or MySQL) are fine. I am not able to derive the logic to rank the data, which would help me pivot it.
My source is a flat file containing data like the sample below. I have loaded that file into a table in Oracle.
Source Input:
**Flatfile1**
**Column1**
Kamesh
65
5000
123456789
Nanu
45
3000
321654789
Expected Output:
Name Age Salary Mobilenumber
Kamesh 65 5000 123456789
Nanu 45 3000 321654789
After loading the file into a table, I want to apply logic that numbers the data so that it eventually looks like this:
Column1 Datavalue
Kamesh 1
65 1
5000 1
123456789 1
Nanu 2
45 2
3000 2
321654789 2
However, I am not able to derive logic (I tried RANK) that gives me a sequence number like this without any key field. I hope this explains the situation.
Thanks!!

Oracle doesn't store rows in any guaranteed order: if you run select * from table1 several times, the rows can come back in different orders depending on database operations and caching.
Therefore, if the table has no other column to order by, it is impossible to "pivot" the data reliably.
I strongly suggest storing the data in a normalized form. If you can't, consider adding a column with an automatically populated row ID (an identity column in Oracle 12c, or a trigger plus a sequence in earlier versions).
Once the rows have a defined order, it is easy to organize the data.
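As a minimal sketch of that last step, the snippet below uses Python's built-in sqlite3 module rather than Oracle so that it runs anywhere. The row-ID column name rn and the table name flatfile1 are assumptions for illustration, and the sketch assumes every record occupies exactly four consecutive rows. In Oracle the same idea would use MOD(rn - 1, 4) and TRUNC((rn - 1) / 4).

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# rn is a hypothetical row-ID column, filled automatically in load order.
conn.execute(
    "CREATE TABLE flatfile1 (rn INTEGER PRIMARY KEY AUTOINCREMENT, column1 TEXT)"
)
values = ["Kamesh", "65", "5000", "123456789", "Nanu", "45", "3000", "321654789"]
conn.executemany("INSERT INTO flatfile1 (column1) VALUES (?)", [(v,) for v in values])

# (rn - 1) / 4 is integer division: it numbers the records 0, 1, ...
# (rn - 1) % 4 is the position of a value inside its record.
pivot_sql = """
SELECT MAX(CASE WHEN (rn - 1) % 4 = 0 THEN column1 END) AS name,
       MAX(CASE WHEN (rn - 1) % 4 = 1 THEN column1 END) AS age,
       MAX(CASE WHEN (rn - 1) % 4 = 2 THEN column1 END) AS salary,
       MAX(CASE WHEN (rn - 1) % 4 = 3 THEN column1 END) AS mobilenumber
FROM flatfile1
GROUP BY (rn - 1) / 4
ORDER BY (rn - 1) / 4
"""
result = conn.execute(pivot_sql).fetchall()
for row in result:
    print(row)
```

The row ID has to be assigned while the file is loaded; adding it afterwards cannot recover the original file order.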

Related

Aggregating / Concatenation of very long Varchar2 strings and find key words in the text || Oracle

I have been given a task to develop a script, function, or query that aggregates groups of rows in a table and then searches the result for specific keywords. The column to be aggregated is a VARCHAR2 column of size 3200, and some of the aggregated rows are well over 5000 characters long.
(I understand that the maximum size of VARCHAR2 is 4000.)
When I try to aggregate the data into a single column, I get a "result of string concatenation is too long" error (ORA-01489).
I have tried built-in aggregators like LISTAGG and XMLAGG, and also some custom functions, but I have been asked to prefer a SQL query over a function or procedure.
Once the data is aggregated, I then have to search through the rows for matching keywords.
(I can't just search the rows without aggregating, because some of the words are split across rows; e.g. row1 ends with "KEYW" and row2 starts with "ORD" when I need to look for "KEYWORD" in the table.)
My table looks roughly like this (I can't post the real table data, sorry):
id_1 | id_2 | name | row_num | description
1    | 5    | A    | 0       | this has so
1    | 5    | A    | 1       | me keyword
1    | 5    | B    | 0       | this is
1    | 3    | E    | 0       | new some
2    | 12   | A    | 0       | diff str
Here, each group is identified by the first three columns, and the fourth column gives the order in which the "description" strings need to be concatenated.
I would like to get the output as:
id_1 | id_2 | name | description (concatenated)
1    | 5    | A    | this has **some** keyword
1    | 3    | E    | new **some**
when looking for the keyword "some".
Please help as I am fairly new to DBs and any help will be highly appreciated.
Thanks & Regards
Kunal

SQL to return records that do not have a complete set according to a second table

I have two tables. I want to find the erroneous records in the first table, based on the fact that they aren't a complete set as determined by the second table. For example:
custID service transID
1 20 1
1 20 2
1 50 2
2 49 1
2 138 1
3 80 1
3 140 1
comboID combinations
1 Y00020Y00050
2 Y00049Y00138
3 Y00020Y00049
4 Y00020Y00080Y00140
So in this example I would want a query to return the first row of the first table because it does not have a matching 49 or 50 or (80 and 140), and the last two rows as well (because there is no 20). The second transaction is fine, and the second customer is fine.
I couldn't figure this out with a query, so I wound up writing a program that loads the services per customer and transid into an array, iterates over them, and ensures that there is at least one matching combination record where all the services in the combination are present in the initially loaded array. Even that came off as hamfisted, but it was less of a nightmare than the awkward outer joining of multiple joins I was trying to accomplish with SQL.
Taking a step back, I think I need to restructure the combinations table into something more accommodating, but I still can't think of what the approach would be.
I do not have DB2, so I have tested this on Oracle. However, the LISTAGG function should be available there as well. The table service is the first table and comb is the second one. I assume the service numbers within each combinations value are sorted in ascending order.
select service.*
from service
join
(
  select S.custid, S.transid
  from
  (
    -- build a key such as Y00020Y00050; lpad keeps 3-digit services (e.g. 138) at 5 digits
    select custid, transid,
           listagg('Y' || lpad(service, 5, '0')) within group (order by service) as agg
    from service
    group by custid, transid
  ) S
  -- keep only the transactions whose key matches no combination
  where not exists
  (
    select *
    from comb
    where S.agg = comb.combinations
  )
) NOT_F on NOT_F.custid = service.custid and NOT_F.transid = service.transid
I dare to say that your database design does not conform to the first normal form since the combinations column is not atomic. Think about it.
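To make the normalization point concrete, here is a rough sketch using Python's built-in sqlite3 module. The table comb_item (one row per combination and service) is a hypothetical restructuring, not something from the question. The query treats a transaction as valid when every service of at least one combination is present in it, which is how the question describes the check.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE service (custid INTEGER, service INTEGER, transid INTEGER)")
conn.executemany(
    "INSERT INTO service VALUES (?, ?, ?)",
    [(1, 20, 1), (1, 20, 2), (1, 50, 2), (2, 49, 1), (2, 138, 1), (3, 80, 1), (3, 140, 1)],
)

# Hypothetical normalized form of the combinations column: one service per row.
conn.execute("CREATE TABLE comb_item (comboid INTEGER, service INTEGER)")
conn.executemany(
    "INSERT INTO comb_item VALUES (?, ?)",
    [(1, 20), (1, 50), (2, 49), (2, 138), (3, 20), (3, 49), (4, 20), (4, 80), (4, 140)],
)

incomplete_sql = """
WITH combo_size AS (
    SELECT comboid, COUNT(*) AS n FROM comb_item GROUP BY comboid
),
satisfied AS (
    -- transactions that contain every service of at least one combination
    SELECT s.custid, s.transid
    FROM service s
    JOIN comb_item c ON c.service = s.service
    JOIN combo_size z ON z.comboid = c.comboid
    GROUP BY s.custid, s.transid, c.comboid, z.n
    HAVING COUNT(DISTINCT s.service) = z.n
)
SELECT s.custid, s.service, s.transid
FROM service s
WHERE NOT EXISTS (
    SELECT 1 FROM satisfied ok
    WHERE ok.custid = s.custid AND ok.transid = s.transid
)
ORDER BY s.custid, s.transid, s.service
"""
incomplete = conn.execute(incomplete_sql).fetchall()
print(incomplete)
```

With one service per row, the check needs no string building, so it does not depend on the order or zero-padding of the codes.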

SQL add column value based on another column ACCESS

What I'm trying to do is add another column to an existing table whose value will depend on an already existing column in the table. For example say I have this table:
Table1
|Letter|
A
C
R
A
I want to create another column (for example, numbers) that is chosen based on the letters. So let's say A corresponds with 10, C with 3 and R with 32 (this was chosen at random). My resulting table should be like this:
|Letter| Number |
A | 10
C | 3
R | 32
A | 10
Can anyone help me write a query that does this? I have over 20 different cases, so the simpler it looks, the better.
Thanks in advance!
Options:
1. Build a table that associates [Letter] with the numeric value, and include it in the query by joining on the common [Letter] field.
2. Use a very long Switch() expression. However, a query design grid cell has a limit of 1024 characters.
It would be better to provide an example with your real data and criteria.
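The lookup-table option might look roughly like the sketch below. It uses Python's built-in sqlite3 module instead of Access so that it is runnable, and the table name LetterMap is an assumption for illustration. The join itself carries over to an Access query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Table1 (Letter TEXT)")
conn.executemany("INSERT INTO Table1 VALUES (?)", [("A",), ("C",), ("R",), ("A",)])

# LetterMap is a hypothetical lookup table: one row per letter/number pair.
conn.execute("CREATE TABLE LetterMap (Letter TEXT PRIMARY KEY, Number INTEGER)")
conn.executemany("INSERT INTO LetterMap VALUES (?, ?)", [("A", 10), ("C", 3), ("R", 32)])

# LEFT JOIN keeps letters that have no mapping yet (their Number is NULL).
lookup_sql = """
SELECT t.Letter, m.Number
FROM Table1 AS t
LEFT JOIN LetterMap AS m ON m.Letter = t.Letter
ORDER BY t.rowid
"""
mapped = conn.execute(lookup_sql).fetchall()
print(mapped)
```

Adding a twenty-first case then means inserting one row into the lookup table rather than editing the query.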

How to insert uneven data rows into matrix in SAS?

I have an originations data set with loan ids. I then have a corresponding dataset with performance data for each of these loans ids, which can be anywhere from 10-40 rows in the performance data set.
The start date of each loan in the performance data is not the same either, although some do overlap. What I want to do is take every loan ID group in the performance data set and build one row from a certain column's values across all of that loan's occurrences. It doesn't matter if the loans start on different dates; I just want to align the values by position, e.g. the first value for loan ID x with the first value for loan ID y.
For example:
ID Date Val
3 201601 100
3 201602 102
3 201603 103
--> Result:
ID Val1 Val2 Val3
3 100 102 103
I'm having two issues. One is the differing size of performance data for each id. I can't construct a matrix with differing lengths of rows. I'm assuming I'll need to append 0's to the end of each row to meet a predefined width.
My second issue is that I'm not sure how to read through the performance data set to group loans, extract the value column, turn that column into a row for that ID, and then insert it into a matrix. I know how I would do this in Python, but I need to use SAS. I can construct tables in SAS, but I'm not sure how to append rows, only columns.
If someone could provide some guidance on this it'd be a great help.
For anyone who runs into a similar issue: it ended up being only a few lines of code.
proc transpose data = new_data
               out = new_data1;
    var trans_state;
    by id;
run;
The output will be

Join More Than 2 Tables

I have three tables.
Table Data contains data for individual parts that come from a "data.txt" file.
Table Limits contains the limits for the Data table from a "limits.txt" file.
Table Files is a listing for each individual .txt file above.
So the "Files" table looks like this. As you can see it is a listing of each file that exists. The LimitsA file will contain the limits for every Data file of type A.
ID File_Name Type Sub-Type
1 DataA_10 A 10
2 DataA_20 A 20
3 DataA_30 A 30
4 LimitsA A NONE
5 DataB_10 B 10
6 DataB_20 B 20
7 LimitsB B NONE
The "Data" table looks like this. The File_ID is the foreign key from the "Files" table. Specifically, this would be data for DataA_10 above:
ID File_ID Dat1 Dat2 Dat3... Dat20
1 1 50 52 53
2 1 12 43 52
3 1 32 42 62
The "Limits" table looks like this. The File_ID is the foreign key from the "Files" table. Specifically, this would be data for LimitsA above:
ID File_ID Sub-Type Lim1 Lim2
1 4 10 40 60
2 4 20 20 30
3 4 30 10 20
So what I want to do is JOIN the correct limits from the "Limits" table to the data from the corresponding "Data" table. Each row of DataA_10 would have the limits "40" and "60" from the LimitsA file. Unfortunately, there is no way to link the Limits table directly to the Data table. The only way to do this is to look back at the Files table and see that LimitsA and DataA_10 are both of type A. Once I link those two together, I then need to grab only the limits for Sub-Type 10.
In the end I would like to have a result that looks like this.
Result:
ID File_ID Dat1 Dat2 Dat3... Dat20 Lim1 Lim2
1 1 50 52 53 40 60
2 1 12 43 52 40 60
3 1 32 42 62 40 60
I hope this is clear enough to understand. It seems to me like an issue of joining more than 2 tables, but I have been unable to find a suitable solution online as of yet. If you have a solution or any advice it would be greatly appreciated.
Your 'Files' table is actually 2 separate (but related) concepts that have been merged. If you break them out using subqueries you'll have a much easier time making a join. Note that joining like this is not the most efficient method, but then again neither is the given schema...
-- Assumes the Sub-Type column is named SubType and that limit files
-- store NULL there (shown as NONE in the question).
SELECT Data.*, Limits.Lim1, Limits.Lim2
FROM (SELECT * FROM Files WHERE SubType IS NOT NULL) DataFiles
JOIN (SELECT * FROM Files WHERE SubType IS NULL) LimitFiles
  ON LimitFiles.Type = DataFiles.Type
JOIN Data
  ON DataFiles.ID = Data.File_ID
JOIN Limits
  ON LimitFiles.ID = Limits.File_ID
  AND DataFiles.SubType = Limits.SubType
ORDER BY Data.File_ID
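As a quick check of the query, the sketch below loads the sample rows from the question into an in-memory SQLite database using Python's built-in sqlite3 module. Two things are assumptions here: the Sub-Type column is named SubType, and the limit files store NULL in it where the question shows NONE. Only Dat1 to Dat3 are included, and Data.ID is added to the ORDER BY to make the row order deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Files (ID INTEGER, File_Name TEXT, Type TEXT, SubType INTEGER)")
conn.execute("CREATE TABLE Data (ID INTEGER, File_ID INTEGER, Dat1 INTEGER, Dat2 INTEGER, Dat3 INTEGER)")
conn.execute("CREATE TABLE Limits (ID INTEGER, File_ID INTEGER, SubType INTEGER, Lim1 INTEGER, Lim2 INTEGER)")

conn.executemany("INSERT INTO Files VALUES (?, ?, ?, ?)", [
    (1, "DataA_10", "A", 10), (2, "DataA_20", "A", 20), (3, "DataA_30", "A", 30),
    (4, "LimitsA", "A", None),  # NULL marks a limits file (assumption)
    (5, "DataB_10", "B", 10), (6, "DataB_20", "B", 20),
    (7, "LimitsB", "B", None),
])
conn.executemany("INSERT INTO Data VALUES (?, ?, ?, ?, ?)", [
    (1, 1, 50, 52, 53), (2, 1, 12, 43, 52), (3, 1, 32, 42, 62),
])
conn.executemany("INSERT INTO Limits VALUES (?, ?, ?, ?, ?)", [
    (1, 4, 10, 40, 60), (2, 4, 20, 20, 30), (3, 4, 30, 10, 20),
])

join_sql = """
SELECT Data.ID, Data.File_ID, Data.Dat1, Data.Dat2, Data.Dat3,
       Limits.Lim1, Limits.Lim2
FROM (SELECT * FROM Files WHERE SubType IS NOT NULL) DataFiles
JOIN (SELECT * FROM Files WHERE SubType IS NULL) LimitFiles
  ON LimitFiles.Type = DataFiles.Type
JOIN Data
  ON DataFiles.ID = Data.File_ID
JOIN Limits
  ON LimitFiles.ID = Limits.File_ID
  AND DataFiles.SubType = Limits.SubType
ORDER BY Data.File_ID, Data.ID
"""
joined = conn.execute(join_sql).fetchall()
for row in joined:
    print(row)
```

If the real table stores the text NONE instead of NULL, the two subquery filters would need to compare against that value instead.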
UPDATE
To be more specific on how to improve the schema: Currently, the Files table doesn't have a clear way to differentiate between Data and Limit file entries. Aside from this, the Data entries don't have a clear link to a single Limit file entry. Although both of these can be figured out as in the SQL above, such logic might not play well with the query optimizer, and certainly can't guarantee the Data-Limit link that you require.
Consider these options:
1. Instead of linking to a 'Limit' file via Type, link directly to a Limit entry Id. Set a foreign key on that link to ensure the expected Limit entry is available.
2. Separate the 'Limit' entries from the 'Data' entries by putting them in a separate table.
3. Create an index on the foreign key. For that matter, add indices for all foreign keys - SQL Server doesn't do this by default.
Of these, I would consider having a foreign key as essential, and the others as modest improvements.
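A rough sketch of the foreign-key option, again using Python's built-in sqlite3 module so that it runs as-is. The table and column names (DataFiles, Limit_ID, ix_datafiles_limit) are hypothetical; the point is only that a data file entry references its limit entry directly and the database rejects a dangling reference.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled

# Hypothetical restructured schema: each data file points at one limit entry.
conn.execute("CREATE TABLE Limits (ID INTEGER PRIMARY KEY, Lim1 INTEGER, Lim2 INTEGER)")
conn.execute("""
CREATE TABLE DataFiles (
    ID INTEGER PRIMARY KEY,
    File_Name TEXT,
    Limit_ID INTEGER NOT NULL REFERENCES Limits (ID)
)
""")
conn.execute("CREATE INDEX ix_datafiles_limit ON DataFiles (Limit_ID)")

conn.execute("INSERT INTO Limits (ID, Lim1, Lim2) VALUES (1, 40, 60)")
conn.execute("INSERT INTO DataFiles (File_Name, Limit_ID) VALUES ('DataA_10', 1)")

# A data file that points at a missing limit entry is rejected.
try:
    conn.execute("INSERT INTO DataFiles (File_Name, Limit_ID) VALUES ('DataA_99', 999)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

linked = conn.execute("""
SELECT d.File_Name, l.Lim1, l.Lim2
FROM DataFiles d
JOIN Limits l ON l.ID = d.Limit_ID
""").fetchall()
print(rejected, linked)
```

With the direct link in place, the join no longer needs the Type and Sub-Type matching from the original query.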