How to combine multiple rows into one line [duplicate] - sql

This question already has answers here:
SQL Query to concatenate column values from multiple rows in Oracle
(10 answers)
Closed 4 years ago.
I have a very simple query:
select date,route,employee
from information
where date=Trunc(Sysdate)
However, some routes have more than one employee assigned, so the query returns two rows for the same route.
I want one row per route, with the employee names combined into a single column separated by "|". How can I achieve this in PL/SQL?

You can use the LISTAGG function, but you have to group by Date and Route as well:
SELECT LISTAGG(employee, ' | ')
         WITHIN GROUP (ORDER BY employee) "Emp",
       date "Date",
       route "Route"
FROM information
WHERE date = TRUNC(SYSDATE)
GROUP BY date, route;
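
Since the question asks for PL/SQL: the same statement can be embedded in a PL/SQL block, for example to print one line per route with DBMS_OUTPUT. A minimal sketch; table and column names are taken as written in the question:

BEGIN
  FOR rec IN (
    SELECT route,
           LISTAGG(employee, ' | ') WITHIN GROUP (ORDER BY employee) AS emps
    FROM information
    WHERE date = TRUNC(SYSDATE)  -- "date" is a reserved word in Oracle; the real column may need quoting
    GROUP BY route
  ) LOOP
    DBMS_OUTPUT.PUT_LINE(rec.route || ': ' || rec.emps);
  END LOOP;
END;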


I want to collect duplicated values in SQL and convert them to a number [duplicate]

This question already has answers here:
How to use count and group by at the same select statement
(11 answers)
Closed 6 months ago.
I want to count a certain value in a column and output it as a number.
Here is an example:
id | job
1  | police
2  | police
3  | ambulance
Now I want to count how often the value "police" appears in the "job" column and get the result as a number: since there are two entries with "police", the output would be 2. For "ambulance" there is only one entry, so the result would be 1.
Can anyone tell me how to write this as code?
I have searched a lot on the Internet and tried things myself, but found nothing that worked.
You're saying you want to count how many of each type of job there is, right?
SELECT COUNT(*), job
FROM tablename
GROUP BY job
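
If you only need the number for one specific value, a WHERE clause is enough (tablename is a placeholder, as in the query above):

SELECT COUNT(*) AS police_count
FROM tablename
WHERE job = 'police';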

Is it possible to select distinct and non-distinct columns in PySpark? [duplicate]

This question already has answers here:
How to get distinct rows in dataframe using pyspark?
(2 answers)
Closed 2 years ago.
I need to select 2 columns from a fact table (attached below). The problem is that for one of the columns I need unique values, while for the other one I'm happy to have them duplicated, as they belong to a specific ticket id.
Fact table used:
df = (
    spark.table(f'nn_table_{country}.fact_table')
    .filter(f.col('date_key').between(start_date, end_date))
    .filter(f.col('is_client_plus') == 1)
    .filter(f.col('source') == 'tickets')
    .filter(f.col('subtype') == 'item_pm')
    .filter(f.col('external_id') == 'DISC0000077144 | DISC0000076895')
    .filter(f.col('external_id').isNotNull())
    .select('customer_id', 'external_id').distinct()
    # .join(dim_promotions, 'external_id', 'left')
)
display(df)
As you can see, the select statement contains a customer_id and an external_id column, where I'm only interested in getting the unique customer_id.
.select('customer_id','external_id').distinct()
Desired output:
customer_id external_id
77000000505097070 DISC0000077144
77000002294023644 DISC0000077144
77000000385346302 DISC0000076895
77000000291101490 DISC0000076895
Any idea how to do that, or whether it's possible at all?
Thanks in advance!
Use dropDuplicates:
df.select('customer_id','external_id').dropDuplicates(['customer_id'])
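
If you need to control which row is kept per customer_id (dropDuplicates keeps an arbitrary one), a window function works as well. A sketch, assuming the df from the question and that the smallest external_id should win:

from pyspark.sql import functions as f
from pyspark.sql.window import Window

# Number the rows within each customer_id and keep only the first one.
w = Window.partitionBy('customer_id').orderBy(f.col('external_id'))
deduped = (
    df.select('customer_id', 'external_id')
      .withColumn('rn', f.row_number().over(w))
      .filter(f.col('rn') == 1)
      .drop('rn')
)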

How to get the latest entries from a database table for a list of values of a specific column? [duplicate]

This question already has answers here:
Fetch the rows which have the Max value for a column for each distinct value of another column
(35 answers)
Select First Row of Every Group in sql [duplicate]
(2 answers)
Return row with the max value of one column per group [duplicate]
(3 answers)
Get value based on max of a different column grouped by another column [duplicate]
(1 answer)
GROUP BY with MAX(DATE) [duplicate]
(6 answers)
Closed 2 years ago.
I have a database table where there is a column hostname and a few other columns.
In hostname, there are many rows with the same hostname.
For example, 192.0.0.1 has 40 entries, 192.0.0.2 has 35 entries, and so on.
Is it possible to get the latest entry for each hostname? Meaning in the result I should get one latest row for 192.0.0.1, one latest row for 192.0.0.2, and so on.
I tried with
SELECT host,cluster,region,service
FROM system
WHERE host IN ("171.33.64.158","171.33.64.159")
ORDER BY id DESC LIMIT 1;
but the output contains only one row, which is just the first row of the entire ordered result.
Can anyone please help me with this requirement?
Thanks,
Swapnil
Try this:
select host, cluster, region, service
from (
    SELECT host, cluster, region, service,
           row_number() over (partition by host order by id desc) as r_num
    FROM system
    WHERE host IN ('171.33.64.158', '171.33.64.159')
) t
where r_num = 1;
Thanks
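
If window functions are not available (for example on older MySQL versions), the same result can be obtained by joining against the per-host MAX(id). A sketch, assuming id grows with insertion time, as the ORDER BY id DESC above implies:

SELECT s.host, s.cluster, s.region, s.service
FROM system s
JOIN (
    SELECT host, MAX(id) AS max_id
    FROM system
    WHERE host IN ('171.33.64.158', '171.33.64.159')
    GROUP BY host
) latest ON latest.max_id = s.id;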
For proper ordering you need to convert the IP address into its decimal equivalent. I would suggest a function like this:
CREATE OR REPLACE FUNCTION IP2Decimal(IP IN VARCHAR2) RETURN NUMBER DETERMINISTIC IS
  DecimalIp NUMBER;
BEGIN
  SELECT SUM(REGEXP_SUBSTR(IP, '\d+', 1, LEVEL) * POWER(256, 4 - LEVEL))
    INTO DecimalIp
    FROM dual
  CONNECT BY LEVEL <= 4;
  RETURN DecimalIp;
END IP2Decimal;
Then your query could be this:
SELECT host,
       FIRST_VALUE(host)    OVER (ORDER BY IP2Decimal(host)),
       FIRST_VALUE(cluster) OVER (ORDER BY IP2Decimal(host)),
       FIRST_VALUE(region)  OVER (ORDER BY IP2Decimal(host)),
       FIRST_VALUE(service) OVER (ORDER BY IP2Decimal(host))
FROM system
GROUP BY host;
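
As a usage example, the function can also be called directly in an ORDER BY to sort the rows numerically by address (a sketch, assuming the function above has been compiled in the schema):

SELECT host, cluster, region, service
FROM system
ORDER BY IP2Decimal(host);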

Concatenate multiple rows to one row [duplicate]

This question already has answers here:
Simulating group_concat MySQL function in Microsoft SQL Server 2005?
(12 answers)
SQL Server: Combine multiple rows into one row from a join table?
(1 answer)
ListAGG in SQLSERVER
(4 answers)
how to stuff date and time in SQL
(2 answers)
Query to get multiple row into single row
(3 answers)
Closed 3 years ago.
I have the data below and have to concatenate the long-text column into a single row. The challenge is that only one row has the notification number; the other rows are NULL, so you cannot simply group by the notification number.
I need the output as 2 rows
row number | Notification Number | Plant | Creation Date | Language | Lineno | Tag | Long Text
1 | 10014354914 | A057 | 43466 | EN | 1 | >X | aaabbbcccdddeeefffggghhhjjjkkklll
2 | 10014354915 | A057 | 43466 | EN | 1 | >X | aaabbbcccdddeeefffgggpppqqqrrrsss
I have used a cursor for this, but it is taking too much time.
If you are using Oracle:
with data("row number", "Notification Number", "Plant", "Creation Date", "Language", "Lineno", "Tag", "Long Text") as (
    select 1, 10014354914, 'A057', 43466, 'EN', 1, '>X', 'aaabbbcccdddeeefffggghhhjjjkkklll' from dual
    union all
    select 2, 10014354915, 'A057', 43466, 'EN', 1, '>X', 'aaabbbcccdddeeefffgggpppqqqrrrsss' from dual
)
select LISTAGG("Long Text", '') within group (order by "row number")
from data;
If you are using SQL Server, maybe try this:
SELECT u.[Long Text] AS [text()]
FROM yourtable u
ORDER BY u.[row number]
FOR XML PATH ('')
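
If you are on SQL Server 2017 or later, STRING_AGG is a simpler alternative (a sketch reusing the illustrative yourtable and column names from the query above):

SELECT STRING_AGG(u.[Long Text], '') WITHIN GROUP (ORDER BY u.[row number]) AS combined_text
FROM yourtable u;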

BigQuery: split a column and get a count of each substring [duplicate]

This question already has answers here:
How can I compute TF/IDF with SQL (BigQuery)
(3 answers)
Closed 4 years ago.
In BigQuery I would like to create a query to count the occurrences of words in a comments field and group by the count of each occurrence. This would help me get a sense of which words are used more than others, and of user behavior and mood. I'm pretty new to BigQuery, so any ideas will be helpful.
What I ended up doing was using the split function...
SELECT
  COUNT(JJ) AS STUFF, JJ
FROM (
  SELECT SPLIT(text, ' ') AS JJ
  FROM [bigquery-public-data:hacker_news.comments]
  LIMIT 1000
)
GROUP BY JJ
ORDER BY STUFF DESC
LIMIT 5
Obviously it can be manipulated more by using replace to remove other characters before splitting.
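
The query above uses the legacy SQL dialect (the [project:dataset.table] syntax); in standard SQL the same idea can be written with UNNEST. A sketch, assuming the same public table is still available:

SELECT word, COUNT(*) AS stuff
FROM `bigquery-public-data.hacker_news.comments`,
     UNNEST(SPLIT(text, ' ')) AS word
GROUP BY word
ORDER BY stuff DESC
LIMIT 5;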