SQL Server - Pivot Out Delimited Column Data Into Rows


I have two columns of delimited data that I would like to pivot out into individual rows for each data item. In the starting table below, the delimited data is represented in the DataPointA and DataPointB columns. Also, note that each ID is a unique identifier for each person. The starting table looks like this:
----------------------------------------------------------
| ID    | FirstName | LastName | DataPointA | DataPointB |
----------------------------------------------------------
| A1234 | Bill      | Jones    | 1,3,7,8    | 1,4        |
| B5678 | Jane      | Smith    | 2,4,6,9    | 1,5        |
----------------------------------------------------------
I would like to take the DataPoint column data that is delimited by commas and create one row for each DataPoint value, while also condensing into one field. So the end result will look like this:
------------------------------------------------------------
| ID    | FirstName | LastName | DataPoint | DataPointType |
------------------------------------------------------------
| A1234 | Bill      | Jones    | 1         | A             |
| A1234 | Bill      | Jones    | 3         | A             |
| A1234 | Bill      | Jones    | 7         | A             |
| A1234 | Bill      | Jones    | 8         | A             |
| A1234 | Bill      | Jones    | 1         | B             |
| A1234 | Bill      | Jones    | 4         | B             |
| B5678 | Jane      | Smith    | 2         | A             |
| B5678 | Jane      | Smith    | 4         | A             |
| B5678 | Jane      | Smith    | 6         | A             |
| B5678 | Jane      | Smith    | 9         | A             |
| B5678 | Jane      | Smith    | 1         | B             |
| B5678 | Jane      | Smith    | 5         | B             |
------------------------------------------------------------
My first instinct was to use UNPIVOT but I am not able to get it to work on two columns. Is there another method I should be using? Thank you in advance.

You don't need pivoting. You need string splitting:
select t.ID, t.FirstName, t.LastName, ab.DataPoint, ab.DataPointType
from t cross apply
     (select a.value as DataPoint, 'A' as DataPointType
      from string_split(t.DataPointA, ',') a
      union all
      select b.value as DataPoint, 'B' as DataPointType
      from string_split(t.DataPointB, ',') b
     ) ab;
string_split() is available in SQL Server 2016 and later, and requires database compatibility level 130 or higher. In older versions, you can use your own split function, which can readily be found on the web.
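
To make the intended row expansion concrete, here is a minimal Python sketch of the same logic outside the database. It is not a replacement for string_split(); the function name expand_rows and the tuple layout are assumptions made for illustration, and the sample rows are the two from the question.

```python
def expand_rows(rows):
    """Yield one output row per delimited value, tagged with its source column."""
    for person_id, first, last, data_a, data_b in rows:
        # Mirror the UNION ALL: first the values from DataPointA, then DataPointB.
        for type_label, delimited in (("A", data_a), ("B", data_b)):
            for value in delimited.split(","):
                yield (person_id, first, last, value.strip(), type_label)


# Sample rows copied from the starting table in the question.
people = [
    ("A1234", "Bill", "Jones", "1,3,7,8", "1,4"),
    ("B5678", "Jane", "Smith", "2,4,6,9", "1,5"),
]

for row in expand_rows(people):
    print(row)
```

Each input row produces one output row per value in either column, which is the same shape the cross apply query returns.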

Related

Teradata SQL code to join against a string

I have a table A that has the below values
+----+----------+-----------------------+
| ID | Date | Name |
+----+----------+-----------------------+
| 1 | 1/4/2019 | Kara,Sara,John |
| 2 | 3/2/2018 | Sara |
| 3 | 4/3/2019 | Lynn,John,Chris,Agnes |
| 4 | 2/1/2020 | Phillip, Anton |
| 5 | 5/1/2020 | Quinn |
| 6 | 7/6/2020 | Idie,John |
+----+----------+-----------------------+
And a table B that has the below values
+-------+
| Name |
+-------+
| John |
| Sara |
| Chris |
+-------+
I would like the output to be as below:
+----+----------+-----------------------+--------+-----------------+
| ID | Date | Name | B.Name | Exists in List? |
+----+----------+-----------------------+--------+-----------------+
| 1 | 1/4/2019 | Kara,Sara,John | Sara | Yes |
| 1 | 1/4/2019 | Kara,Sara,John | John | Yes |
| 2 | 3/2/2018 | Sara | Sara | Yes |
| 3 | 4/3/2019 | Lynn,John,Chris,Agnes | John | Yes |
| 3 | 4/3/2019 | Lynn,John,Chris,Agnes | Chris | Yes |
| 4 | 2/1/2020 | Phillip, Anton | | No |
| 5 | 5/1/2020 | Quinn | | No |
| 6 | 7/6/2020 | Idie,John | John | Yes |
+----+----------+-----------------------+--------+-----------------+
I tried using CONTAINS, but it looks like Teradata SQL does not accept it. I also tried CSVLD to convert the text to columns. However, since there is no fixed number of commas in the string, I cannot use the CSVLD function without knowing beforehand exactly how many columns I need to create from the text.
I am wondering if there is an alternative way to join a column against a string of values. I appreciate your kind input.

You should really fix your data model -- storing multiple values in a string is a bad, bad data design. SQL has a great way of storing lists -- it is called a table.
Assuming you are stuck with someone else's really, really bad data model, you can use a left join:
select a.*, b.name,
(case when b.name is not null then 'Yes' else 'No' end) as in_list
from a left join
b
on ',' || a.name || ',' like '%,' || b.name || ',%';
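
The delimiter-wrapping trick in the join condition can be tried locally. The sketch below runs the same left join in SQLite through Python's sqlite3 module rather than in Teradata, so treat it as an illustration of the pattern only; the function name match_names is made up, and the Date column is left out for brevity.

```python
import sqlite3


def match_names():
    """Run the delimiter-wrapped LIKE join on the sample data from the question."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE a (id INTEGER, name TEXT);
        CREATE TABLE b (name TEXT);
        INSERT INTO a VALUES
            (1, 'Kara,Sara,John'), (2, 'Sara'), (3, 'Lynn,John,Chris,Agnes'),
            (4, 'Phillip, Anton'), (5, 'Quinn'), (6, 'Idie,John');
        INSERT INTO b VALUES ('John'), ('Sara'), ('Chris');
    """)
    # Wrapping both sides in commas makes 'John' match only a whole list item,
    # never a fragment of a longer name.
    rows = conn.execute("""
        SELECT a.id, b.name,
               CASE WHEN b.name IS NOT NULL THEN 'Yes' ELSE 'No' END AS in_list
        FROM a LEFT JOIN b
             ON ',' || a.name || ',' LIKE '%,' || b.name || ',%'
        ORDER BY a.id, b.name
    """).fetchall()
    conn.close()
    return rows


for row in match_names():
    print(row)
```

Rows 4 and 5 come back once each with no b.name and 'No', while rows with several matches are repeated once per matching name, as in the desired output.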

Updating table based on the results of previous query

How can I update the table based on the results of the previous query?
The original query (big thanks to GMB) can find any items in address (users table) that have a match in address (address_effect table).
From the result of this query, I want to find the count of address in the address_effect table and add it into a new column in the table “users”. For example, john doe has a match with idaho and usa in the address column so it’ll show a count of ‘2’ in the count column.
Fyi, I'm testing this on my local system with XAMPP (using MariaDB).
users table
+----+-----------+------------+----------------------------+-------+
| ID | firstname | lastname   | address                    | count |
+----+-----------+------------+----------------------------+-------+
| 1  | john      | doe        | james street, idaho, usa   |       |
| 2  | cindy     | smith      | rollingwood av,lyn, canada |       |
| 3  | rita      | chatsworth | arajo ct, alameda, cali    |       |
| 4  | randy     | plies      | smith spring, lima, peru   |       |
| 5  | Matt      | gwalio     | park lane, atlanta, usa    |       |
+----+-----------+------------+----------------------------+-------+
address_effect table
+---------+---------------+
| address | effect        |
+---------+---------------+
| idaho   | potato, tater |
| canada  | cold, tundra  |
| fremont | crowded       |
| peru    | alpaca        |
| atlanta | peach, cnn    |
| usa     | big, hard     |
+---------+---------------+

Use a correlated subquery which returns the number of matches:
UPDATE users u
SET u.count = (
SELECT COUNT(*)
FROM address_effect a
WHERE FIND_IN_SET(a.address, REPLACE(u.address, ', ', ','))
)
Results:
> ID | firstname | lastname | address | count
> -: | :-------- | :--------- | :------------------------- | ----:
> 1 | john | doe | james street, idaho, usa | 2
> 2 | cindy | smith | rollingwood av,lyn, canada | 1
> 3 | rita | chatsworth | arajo ct, alameda, cali | 0
> 4 | randy | plies | smith spring, lima, peru | 1
> 5 | Matt | gwalio | park lane, atlanta, usa | 2
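
For readers without a MySQL or MariaDB instance at hand, the correlated-subquery update can be approximated in SQLite via Python's sqlite3 module. SQLite has no FIND_IN_SET, so this sketch swaps in a comma-wrapped LIKE comparison as a stand-in; that substitution, the function name update_counts, and the trimmed column list are assumptions, not part of the answer above.

```python
import sqlite3


def update_counts():
    """Fill users."count" with the number of address_effect rows found in each address."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE users (id INTEGER, firstname TEXT, address TEXT, "count" INTEGER);
        CREATE TABLE address_effect (address TEXT, effect TEXT);
        INSERT INTO users (id, firstname, address) VALUES
            (1, 'john',  'james street, idaho, usa'),
            (2, 'cindy', 'rollingwood av,lyn, canada'),
            (3, 'rita',  'arajo ct, alameda, cali'),
            (4, 'randy', 'smith spring, lima, peru'),
            (5, 'Matt',  'park lane, atlanta, usa');
        INSERT INTO address_effect VALUES
            ('idaho', 'potato, tater'), ('canada', 'cold, tundra'),
            ('fremont', 'crowded'), ('peru', 'alpaca'),
            ('atlanta', 'peach, cnn'), ('usa', 'big, hard');
    """)
    # Stand-in for FIND_IN_SET: normalise ', ' to ',' and compare whole items
    # by wrapping both the list and the search term in commas.
    conn.execute("""
        UPDATE users
        SET "count" = (
            SELECT COUNT(*)
            FROM address_effect a
            WHERE ',' || REPLACE(users.address, ', ', ',') || ','
                  LIKE '%,' || a.address || ',%'
        )
    """)
    rows = conn.execute(
        'SELECT firstname, "count" FROM users ORDER BY id'
    ).fetchall()
    conn.close()
    return rows


print(update_counts())
```

Because COUNT(*) in the subquery returns 0 when nothing matches, every user gets a value, including those with no matching address.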

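Note: I checked this in MySQL, but not in MariaDB.
The count column of the users table can also be updated with an UPDATE statement that uses an INNER JOIN. The joined subquery is your original query, modified to use GROUP BY. Because of the inner join, users with no matching address are not updated, so their count keeps its previous value (NULL if it was never set).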
UPDATE users AS u
INNER JOIN
(
-- your original query modified
SELECT u.ID AS ID, count(u.ID) AS count
FROM users u
INNER JOIN address_effect a
ON FIND_IN_SET(a.address, REPLACE(u.address, ', ', ','))
GROUP BY u.ID
) AS c ON u.ID=c.ID
SET u.count=c.count;

SQL, query to check and list distinct entries that occur in another table within a specific time frame

I'm using Oracle.
I have two tables. One contains users and the other is an access log of sorts. I need to list all users whose latest log entry appears in the log within a specified time frame including the timestamp of the latest entry. A single user can have several entries in the log.
Here are simplified versions of the tables:
Users
|----------------------------------|
| userid| username | name |
|----------------------------------|
| 1 | josm | John Smith |
| 2 | lajo | Laura Jones |
| 3 | miwi | Mike Williams |
| 4 | subo | Susan Brown |
| 5 | peda | Peter Davis |
| 6 | jami | Jane Miller |
|----------------------------------|
Log
|----------------------------------|
| userid| action | timestamp |
|----------------------------------|
| 3 | a | 20-01-2020 |
| 2 | v | 19-11-2019 |
| 2 | y | 02-11-2019 |
| 4 | b | 15-09-2019 |
| 1 | a | 23-05-2019 |
| 6 | y | 22-05-2019 |
| 3 | b | 16-04-2019 |
| 2 | a | 07-01-2019 |
| 5 | v | 18-11-2018 |
| 6 | a | 12-09-2018 |
|----------------------------------|
Desired result if the time frame is set to last six months:
|---------------------------------------|
| username | name | timestamp |
|--------------------------|------------|
| miwi | Mike Williams | 20-01-2020 |
| lajo | Laura Jones | 19-11-2019 |
| subo | Susan Brown | 15-09-2019 |
|---------------------------------------|
Any help will be greatly appreciated.

You can use aggregation:
select u.username, u.name, max(l.timestamp) as latest_timestamp
from log l join
     users u
     on l.userid = u.userid
group by u.userid, u.username, u.name
having max(l.timestamp) >= add_months(sysdate, -6)
order by latest_timestamp desc
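
The aggregate-then-filter pattern can be checked against the sample data with SQLite through Python's sqlite3 module. This is only a sketch: SQLite's date(?, '-6 months') stands in for Oracle's add_months(sysdate, -6), the timestamp column is renamed ts and stored as ISO text, and the reference date is passed in explicitly. The function name recent_users and the date used in the call are assumptions chosen so that the six-month window matches the desired result.

```python
import sqlite3


def recent_users(as_of):
    """Return users whose latest log entry falls within six months before as_of."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE users (userid INTEGER, username TEXT, name TEXT);
        CREATE TABLE log (userid INTEGER, action TEXT, ts TEXT);
        INSERT INTO users VALUES
            (1, 'josm', 'John Smith'), (2, 'lajo', 'Laura Jones'),
            (3, 'miwi', 'Mike Williams'), (4, 'subo', 'Susan Brown'),
            (5, 'peda', 'Peter Davis'), (6, 'jami', 'Jane Miller');
        INSERT INTO log VALUES
            (3, 'a', '2020-01-20'), (2, 'v', '2019-11-19'), (2, 'y', '2019-11-02'),
            (4, 'b', '2019-09-15'), (1, 'a', '2019-05-23'), (6, 'y', '2019-05-22'),
            (3, 'b', '2019-04-16'), (2, 'a', '2019-01-07'), (5, 'v', '2018-11-18'),
            (6, 'a', '2018-09-12');
    """)
    # HAVING filters on each user's latest entry, not on individual log rows.
    rows = conn.execute("""
        SELECT u.username, u.name, MAX(l.ts) AS latest
        FROM log l JOIN users u ON l.userid = u.userid
        GROUP BY u.userid, u.username, u.name
        HAVING MAX(l.ts) >= date(?, '-6 months')
        ORDER BY latest DESC
    """, (as_of,)).fetchall()
    conn.close()
    return rows


for row in recent_users("2020-02-01"):
    print(row)
```

Filtering in HAVING rather than WHERE matters here: the comparison is applied to each user's most recent timestamp after grouping, so a user appears at most once.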

group by SQL properly

I have a table that stores the names and values on separate rows for work that takes place on a location like this below.
+--------+------------+--------+------------+----------+
| WorkID | Attribute | Value | Changedby | Date |
+--------+------------+--------+------------+----------+
| 1 | Unit Name | Unit 1 | John Smith | Jan-2018 |
| 1 | Unit Value | OK | John Smith | Jan-2018 |
| 2 | Unit Name | Unit 2 | John Smith | Feb-2018 |
| 2 | Unit Value | Not Ok | John Smith | Feb-2018 |
| 3 | Unit Name | Unit 3 | John Smith | Mar-2018 |
| 3 | Unit Value | OK | John Smith | Mar-2018 |
+--------+------------+--------+------------+----------+
I have a query on this table that joins other tables and the output looks like this.
+--------+--------------+--------------------+----------------------+------------+----------+
| WorkID | Location | Value when unit ID | Value when ok/not ok | Changedby | Date |
+--------+--------------+--------------------+----------------------+------------+----------+
| 1 | Springfield | Unit 1 | NULL | John Smith | Jan-2018 |
| 1 | Springfield | NULL | OK | John Smith | Jan-2018 |
| 2 | Shelbyville | Unit 2 | NULL | John Smith | Feb-2018 |
| 2 | Shelbyville | NULL | Not Ok | John Smith | Feb-2018 |
| 3 | Capital City | Unit 3 | NULL | John Smith | Mar-2018 |
| 3 | Capital City | NULL | OK | John Smith | Mar-2018 |
+--------+--------------+--------------------+----------------------+------------+----------+
What ends up happening is that the attribute "Value" is either the name of my unit or the result of the test. How do I group this so that both show up on the same line?
+--------+--------------+--------------------+----------------------+------------+----------+
| WorkID | Location | Value when unit ID | Value when ok/not ok | Changedby | Date |
+--------+--------------+--------------------+----------------------+------------+----------+
| 1 | Springfield | Unit 1 | OK | John Smith | Jan-2018 |
| 2 | Shelbyville | Unit 2 | Not OK | John Smith | Feb-2018 |
| 3 | Capital City | Unit 3 | OK | John Smith | Mar-2018 |
+--------+--------------+--------------------+----------------------+------------+----------+

I would join the table to itself, filtering once by name and once by value, as in:
select
  a.workid,
  a.value as name,
  v.value as value,
  a.changedby,
  a.date
from my_table a
left join my_table v on a.workid = v.workid
  and v.attribute = 'Unit Value'
where a.attribute = 'Unit Name'
I used a left join to include units that don't have a value yet. The filter on v.attribute belongs in the ON clause: in the WHERE clause it would discard the rows where v is NULL and turn the left join into an inner join.
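
A left join keeps unmatched rows only when the filter on the right-hand table sits in the ON clause. The following sketch, run in SQLite via Python's sqlite3 module, contrasts the two placements. The sample rows are hypothetical (work item 2 has a name but no value yet), the function name unit_rows is made up, and the changedby and date columns are omitted.

```python
import sqlite3


def unit_rows(filter_in_on):
    """Self-join my_table, placing the 'Unit Value' filter in ON or in WHERE."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE my_table (workid INTEGER, attribute TEXT, value TEXT);
        INSERT INTO my_table VALUES
            (1, 'Unit Name', 'Unit 1'), (1, 'Unit Value', 'OK'),
            (2, 'Unit Name', 'Unit 2');  -- hypothetical: no 'Unit Value' row yet
    """)
    if filter_in_on:
        # Filter applied while joining: unmatched names survive with NULL.
        sql = """
            SELECT a.workid, a.value, v.value
            FROM my_table a
            LEFT JOIN my_table v
                   ON a.workid = v.workid AND v.attribute = 'Unit Value'
            WHERE a.attribute = 'Unit Name'
            ORDER BY a.workid
        """
    else:
        # Filter applied after joining: rows where v is NULL are discarded.
        sql = """
            SELECT a.workid, a.value, v.value
            FROM my_table a
            LEFT JOIN my_table v ON a.workid = v.workid
            WHERE a.attribute = 'Unit Name' AND v.attribute = 'Unit Value'
            ORDER BY a.workid
        """
    rows = conn.execute(sql).fetchall()
    conn.close()
    return rows


print(unit_rows(filter_in_on=True))
print(unit_rows(filter_in_on=False))
```

With the filter in ON, work item 2 is returned with an empty value; with the filter in WHERE, it disappears from the result.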

Use max of column and if null use min

I have a table with 10 milestones in the column milestone. The column milestone_achieved has either the value OK or NULL.
The name column has just names, whenever someone new enters, all the milestones are entered in the database with NULL.
Here is what a typical table looks like:
+------+-----------+--------------------+
| name | milestone | milestone_achieved |
+------+-----------+--------------------+
| John | 1 | OK |
| John | 2 | OK |
| John | 3 | NULL |
| John | 4 | NULL |
| John | 5 | NULL |
| John | 6 | NULL |
| Mary | 1 | OK |
| Mary | 2 | OK |
| Mary | 3 | OK |
| Mary | 4 | OK |
| Mary | 5 | OK |
| Mary | 6 | OK |
| Tim | 1 | NULL |
| Tim | 2 | NULL |
| Tim | 3 | NULL |
| Tim | 4 | NULL |
| Tim | 5 | NULL |
| Tim | 6 | NULL |
+------+-----------+--------------------+
Now I want the SQL query to return:
+------+-----------+--------------------+
| name | milestone | milestone_achieved |
+------+-----------+--------------------+
| John | 2 | OK |
| Mary | 6 | OK |
| Tim | 1 | NULL |
+------+-----------+--------------------+
My query right now looks like this:
SELECT name, MAX(milestone) FROM table HAVING milestone_achieved = 'OK' GROUP BY name
UNION ALL
SELECT name, MIN(milestone) FROM table HAVING milestone_achieved IS NULL AND MIN(milestone) = 1 GROUP BY name
This works in 90% of the cases. The problem occurs when, for example, milestones 1 and 2 were completed, but milestone 1 was later "uncompleted" because it didn't meet some specific criterion (imagine an assembly line where cars are assembled: a screw, milestone 1, isn't tight enough, but the paint, milestone 2, is already on).
I am now looking for a way to properly display those 10% of cases.

One method is:
SELECT name, MAX(milestone)
FROM table
WHERE milestone_achieved = 'OK'
GROUP BY name
UNION ALL
SELECT name, MIN(milestone)
FROM table
GROUP BY name
HAVING MIN(milestone_achieved) IS NULL;
This follows the structure of your logic. You can do this with one SELECT:
SELECT name,
COALESCE(MAX(CASE WHEN milestone_achieved = 'OK' THEN milestone END),
MIN(milestone)
)
FROM table
GROUP BY name
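
The single-SELECT version can be verified on the sample data with SQLite via Python's sqlite3 module. In this sketch the table is called milestones, because table is a reserved word; that name and the function name latest_milestones are assumptions made for illustration.

```python
import sqlite3


def latest_milestones():
    """Highest achieved milestone per name, falling back to the lowest milestone."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE milestones (name TEXT, milestone INTEGER, milestone_achieved TEXT)"
    )
    # Sample data from the question: John reached 2, Mary reached 6, Tim none.
    data = []
    for name, done in (("John", 2), ("Mary", 6), ("Tim", 0)):
        for m in range(1, 7):
            data.append((name, m, "OK" if m <= done else None))
    conn.executemany("INSERT INTO milestones VALUES (?, ?, ?)", data)
    # The CASE yields NULL for unachieved rows, so MAX ignores them; COALESCE
    # supplies MIN(milestone) when a name has no achieved milestone at all.
    rows = conn.execute("""
        SELECT name,
               COALESCE(MAX(CASE WHEN milestone_achieved = 'OK' THEN milestone END),
                        MIN(milestone)) AS milestone
        FROM milestones
        GROUP BY name
        ORDER BY name
    """).fetchall()
    conn.close()
    return rows


print(latest_milestones())
```

Since the conditional MAX looks only at rows marked OK, a name whose milestone 1 was reset to NULL while milestone 2 is still OK reports milestone 2.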
