Ordering 2 columns on the same order - sql

I have the table:
Name | Last Name
Albert | Rigs
Carl | Dimonds
Robert | Big
Julian | Berg
I need to order it like this:
Name | Last Name
Albert | Rigs (name)
Julian | Berg (last name)
Robert | Big (last name)
Carl | Dimonds
I need something like ordering by name and last name within the same ordering.
In the example, I have the name Albert, and the next name in order is Carl, but I also have Big and Berg as last names. B sorts before C, so those rows are placed by their last name ahead of Carl.
It's as if the two columns were a single column, but they aren't.
It's hard to explain, I'm sorry.
Is it possible?
Thanks in advance.

To order by the minimum of (Name, Lastname), you could:
select *
from YourTable
order by
case
when Name > LastName then LastName
else name
end

A syntactic improvement on the CASE expression, which also allows a tie-break on the other column:
select *
from my_table
order by least(name,last_name),
greatest(name,last_name)
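As a rough check of the ordering above, here is a minimal sketch that runs the same idea against the question's sample rows using Python's built-in sqlite3 module. It assumes SQLite rather than the asker's database: SQLite has no least()/greatest(), so its multi-argument min()/max() stand in for them. The table name my_table and the helper name are illustrative only.

```python
import sqlite3

# Sample rows from the question.
ROWS = [("Albert", "Rigs"), ("Carl", "Dimonds"), ("Robert", "Big"), ("Julian", "Berg")]


def order_by_smaller_column(rows):
    """Sort by the alphabetically smaller of the two columns, then by the larger one."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE my_table (name TEXT, last_name TEXT)")
    conn.executemany("INSERT INTO my_table VALUES (?, ?)", rows)
    # Assumption: SQLite's scalar min(a, b) / max(a, b) replace least() / greatest().
    result = conn.execute(
        "SELECT name, last_name FROM my_table "
        "ORDER BY min(name, last_name), max(name, last_name)"
    ).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in order_by_smaller_column(ROWS):
        print(row)
```

The sort keys here are Albert, Berg, Big and Carl, which reproduces the order requested in the question.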

How do you merge duplicate rows in a table in BigQuery - replacing missing values with most recent records

For example, I have a table of leads from a marketing database. There are multiple records with duplicate email values. I'd like to merge all of the duplicate records to roll up into the latest updated record and if the latest updated record is missing values for certain fields then update those fields from other records most recently updated.
Table:
First | Last | Email | Phone | Job Title | State | Last Updated
John | Doe | john.doe#example.com | | | MD | 1/1/2019
John | low | john.doe#example.com | 1234567891 | Coach | VA | 1/1/2018
John | Doe | john.doe#example.com | 3214569875 | Teacher | CA | 1/1/2017
Andy | Yes | john.doe#example.com | | | DC | 1/1/2021
Roby | Doe | john.doe#example.com | 8628423578 | Scientist | VA | 1/1/2025
Output - One record:
First | Last | Email | Phone | Job Title | State | Last Updated
Andy | Yes | john.doe#example.com | 1234567891 | Coach | DC | 1/1/2021
In this example, since the 2021 record is missing a phone number and job title, those values are pulled from the most recently updated record that has them (2018).
I've thought about using Distinct or Unique functions but not sure how to execute on the merge using the last updated record and then filling in the blank values with the other most recent records. Any help would be greatly appreciated!!
Thank you in advance.
Best,
Dawit
Consider the approach below. I think it is the most generic: you just need to make sure you have the correct list of fields in the unpivot and pivot lines. It does assume that the fields (First, Last, Phone, Job_Title, State) are all of string data type.
select First, Last, Email, Phone, Job_Title, State, max_Last_Updated as Last_Updated
from (
select * except(Last_Updated),
max(Last_Updated) over(partition by Email) as max_Last_Updated
from data
unpivot (value for col in (First, Last, Phone, Job_Title, State))
where true
qualify row_number() over(partition by Email, col order by Last_Updated desc) = 1
)
pivot (max(value) for col in ('First', 'Last', 'Phone', 'Job_Title', 'State', 'Last_Updated'))
If applied to sample data in your question (excluding 2025 row) - output is
You need a method to know that these are all the same record. You can use last_value(ignore nulls) for this purpose:
select t.*,
last_value(first ignore nulls) over (partition by email order by last_updated) as imputed_first,
last_value(last ignore nulls) over (partition by email order by last_updated) as imputed_last,
. . . -- and so on for the other columns
from t;
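To make the fill rule concrete outside BigQuery, here is a small plain-Python sketch of the same logic: per email, keep the latest Last Updated value and, for each field, the most recent non-empty value. It is only an illustration under assumed names (the dictionary keys and the merge_duplicates helper are hypothetical), and like the first answer it leaves out the 2025 row.

```python
from datetime import date

# Fields that may be blank and get back-filled; key names are assumptions.
FIELDS = ("first", "last", "phone", "job_title", "state")

# Sample rows from the question, without the 2025 row.
SAMPLE = [
    {"first": "John", "last": "Doe", "email": "john.doe#example.com",
     "phone": None, "job_title": None, "state": "MD", "last_updated": date(2019, 1, 1)},
    {"first": "John", "last": "low", "email": "john.doe#example.com",
     "phone": "1234567891", "job_title": "Coach", "state": "VA", "last_updated": date(2018, 1, 1)},
    {"first": "John", "last": "Doe", "email": "john.doe#example.com",
     "phone": "3214569875", "job_title": "Teacher", "state": "CA", "last_updated": date(2017, 1, 1)},
    {"first": "Andy", "last": "Yes", "email": "john.doe#example.com",
     "phone": None, "job_title": None, "state": "DC", "last_updated": date(2021, 1, 1)},
]


def merge_duplicates(records):
    """Collapse records sharing an email; the newest non-empty value wins per field."""
    merged = {}
    # Newest first, so the first non-empty value seen for a field is the one kept.
    for rec in sorted(records, key=lambda r: r["last_updated"], reverse=True):
        out = merged.get(rec["email"])
        if out is None:
            out = {"email": rec["email"], "last_updated": rec["last_updated"]}
            out.update({field: None for field in FIELDS})
            merged[rec["email"]] = out
        for field in FIELDS:
            if out[field] is None and rec.get(field):
                out[field] = rec[field]
    return list(merged.values())


if __name__ == "__main__":
    print(merge_duplicates(SAMPLE))
```

On these four rows the result is the single Andy / Yes record with the phone number and job title taken from the 2018 row.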

How to get the differences between two rows **and** the name of the field where the difference is, in BigQuery?

I have a table in BigQuery like this:
Name | Phone Number | Address
John | 123456778564 | 1 Penny Lane
John | 873452987424 | 1 Penny Lane
Mary | 845704562848 | 87 5th Avenue
Mary | 845704562848 | 54 Lincoln Rd.
Amy | 342847327234 | 4 Ocean Drive Avenue
Amy | 347907387469 | 98 Truman Rd.
I want to get a table with the differences between two consecutive rows and the name of the field where the difference occurs.
I mean this:
Name | Field | Before | After
John | Phone Number | 123456778564 | 873452987424
Mary | Address | 87 5th Avenue | 54 Lincoln Rd.
Amy | Phone Number | 342847327234 | 347907387469
Amy | Address | 4 Ocean Drive Avenue | 98 Truman Rd.
How can I do this ? I've looked on other posts but couldn't find something that corresponds to my need.
Thank you
Consider below BigQuery'ish solution
select Name, ['Phone Number', 'Address'][offset(offset)] Field,
prev_field as Before, field as After
from (
select timestamp, Name, offset, field,
lag(field) over (partition by Name, offset order by timestamp) as prev_field
from yourtable,
unnest([`Phone Number`, Address]) field with offset
)
where prev_field != field
if applied to sample data in your question - output is
As you can see, no matter how many columns you need to compare, it is still just one query, with no unions and such.
You just need to enumerate your columns in two places:
['Phone Number', 'Address'][offset(offset)] Field
and
unnest([`Phone Number`, Address]) field with offset
Note: you can further refactor above using scripting's execute immediate to compose such lists within the query on the fly (check my other answers - I frequently use such technique in them)
One method is just to use lag() and union all:
select name, 'phone', prev_phone as before, phone as after
from (select name, phone,
lag(phone) over (partition by name order by timestamp) as prev_phone
from t
) t
where prev_phone <> phone
union all
select name, 'address', prev_address as before, address as after
from (select name, address,
lag(address) over (partition by name order by timestamp) as prev_address
from t
) t
where prev_address <> address
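Both answers order rows by a timestamp column that the sample table does not show. The sketch below runs the lag() plus union all variant on the sample data through Python's sqlite3 module, with a hypothetical ts column added to supply that ordering. It assumes an SQLite build with window-function support (3.25 or later); the table name t and the helper name are illustrative.

```python
import sqlite3

# Sample rows from the question; the leading ts value is an assumed ordering column.
ROWS = [
    (1, "John", "123456778564", "1 Penny Lane"),
    (2, "John", "873452987424", "1 Penny Lane"),
    (1, "Mary", "845704562848", "87 5th Avenue"),
    (2, "Mary", "845704562848", "54 Lincoln Rd."),
    (1, "Amy", "342847327234", "4 Ocean Drive Avenue"),
    (2, "Amy", "347907387469", "98 Truman Rd."),
]

QUERY = """
SELECT name, 'Phone Number' AS field, prev_phone AS before, phone AS after
FROM (SELECT name, phone,
             lag(phone) OVER (PARTITION BY name ORDER BY ts) AS prev_phone
      FROM t)
WHERE prev_phone <> phone
UNION ALL
SELECT name, 'Address', prev_address, address
FROM (SELECT name, address,
             lag(address) OVER (PARTITION BY name ORDER BY ts) AS prev_address
      FROM t)
WHERE prev_address <> address
"""


def find_changes(rows):
    """Return (name, field, before, after) for every value that changed between rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (ts INTEGER, name TEXT, phone TEXT, address TEXT)")
    conn.executemany("INSERT INTO t VALUES (?, ?, ?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for change in find_changes(ROWS):
        print(change)
```

The first row of each name has no predecessor, so lag() yields NULL there and the <> comparison filters it out without any extra condition.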

Grouping values and changing values which do not allow the rest of the row to group

I'm not sure how to describe this, but I want to group rows where one field has two or more different values, and combine those values (by concatenating or otherwise changing them) so that each group gives just one single row.
For example:
I have a simple table (all fields are Strings) of people next to their departments. But some people belong to more than one department.
select department_ind, name
from jobs
;
department_ind name
1 Michael
2 Michael
2 Sarah
3 Dave
2 Sally
4 Sally
I want to group by name and concatenate the department_ind values, so the results should look like:
department_ind name
1,2 Michael
2 Sarah
3 Dave
2,4 Sally
Thanks
Use string_agg()
select string_agg(department_ind::text, ',') as departments,
name
from jobs
group by name;
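For a quick local check, the same grouping can be sketched with Python's sqlite3 module. This assumes SQLite, where group_concat() plays the role of string_agg() and no ::text cast is needed because the column is already text. The helper name is illustrative, and since the query has no ordering clause, the order of values inside each concatenated string is not guaranteed.

```python
import sqlite3

# Sample rows from the question (all fields are strings).
ROWS = [
    ("1", "Michael"), ("2", "Michael"), ("2", "Sarah"),
    ("3", "Dave"), ("2", "Sally"), ("4", "Sally"),
]


def departments_by_name(rows):
    """Return (departments, name) pairs with one row per name."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE jobs (department_ind TEXT, name TEXT)")
    conn.executemany("INSERT INTO jobs VALUES (?, ?)", rows)
    # Assumption: SQLite's group_concat() stands in for string_agg().
    result = conn.execute(
        "SELECT group_concat(department_ind, ',') AS departments, name "
        "FROM jobs GROUP BY name"
    ).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for departments, name in departments_by_name(ROWS):
        print(departments, name)
```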

displaying sql results by a group based on column

I have, say, thousands of records in my table. I want to display records together by city. It's a lot more complicated than that, since I also need them displayed in alphabetical order by customer name. How do I achieve this? GROUP BY seems to want to give me a total instead of displaying each of my records. So:
mark zuck some city
john smith cherryville
bill gates some city
jane doe cherryville
should return
bill gates some city
mark zuck some city
jane doe cherryville
john smith cherryville
This is an over-simplification but the idea stands. I appreciate all the help. thank you!
GROUP BY is for aggregations. There is no aggregation in your query. You just want your output to be sorted. In this case, ORDER BY fits the purpose.
select * from table1
order by city, customer
In English: get all table1 data sorted first by city, then by customer.

Is there a way to select results after a certain id in an order list?

I'm trying to implement a cursor-based paginating list based off of data from a Postgres database.
As an example, say I have a table with the following columns:
id | firstname | lastname
I want to paginate this data, which would be pretty simple if I only ever wanted to sort it by the id, but in my case, I want the option to sort by last name, and there's guaranteed to be multiple people with the same last name.
If I have a select statement like follows:
SELECT * FROM people
ORDER BY lastname ASC;
In that case, I could make my encoded cursor contain information about the lastname so I could pick up where I left off, but since there will be multiple users with the same last name, this will be buggy. Is there a way in SQL to only get the results after a certain id in an ordered list where it is not the column by which the results are sorted?
Example results from the select statement:
1 | John | Doe
4 | John | Price
2 | Joe | White
6 | Jim | White
3 | Sam | White
5 | Sally | Young
If I wanted a page size of 3, I couldn't add WHERE lastname >= :lastname as I'd have duplicate data on the list, since it would return ids 2, 6, and 3 during that call. In my case, it'd be helpful if I could add to my query something similar to AFTER id = 6 where it could skip everything until it finds that id in the ordered list.
Yes. If I understand correctly:
select t.*
from t
where (lastname, id) > (select t2.lastname, t2.id
from t t2
where t2.id = ?
)
order by t.lastname, t.id;
I think I would add firstname into the mix, but it is the same idea.
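Here is a minimal sketch of that row-value comparison used as a page cursor, run on the question's sample rows through Python's sqlite3 module. It assumes SQLite 3.15 or later (for row values) rather than Postgres, sorts by (lastname, id) so that the id breaks ties deterministically, and adds a LIMIT for the page size. The fetch_page and make_db helpers and their parameters are hypothetical.

```python
import sqlite3

# Sample rows from the question: (id, firstname, lastname).
PEOPLE = [
    (1, "John", "Doe"), (4, "John", "Price"), (2, "Joe", "White"),
    (6, "Jim", "White"), (3, "Sam", "White"), (5, "Sally", "Young"),
]


def make_db(rows):
    """Build an in-memory people table holding the sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE people (id INTEGER PRIMARY KEY, firstname TEXT, lastname TEXT)")
    conn.executemany("INSERT INTO people VALUES (?, ?, ?)", rows)
    return conn


def fetch_page(conn, after_id=None, page_size=3):
    """Return the next page ordered by (lastname, id), starting after the row with after_id."""
    if after_id is None:
        sql = "SELECT id, firstname, lastname FROM people ORDER BY lastname, id LIMIT ?"
        params = (page_size,)
    else:
        # The (lastname, id) pair of the cursor row marks where the previous page ended.
        sql = (
            "SELECT id, firstname, lastname FROM people "
            "WHERE (lastname, id) > (SELECT lastname, id FROM people WHERE id = ?) "
            "ORDER BY lastname, id LIMIT ?"
        )
        params = (after_id, page_size)
    return conn.execute(sql, params).fetchall()


if __name__ == "__main__":
    conn = make_db(PEOPLE)
    first = fetch_page(conn)
    print(first)
    print(fetch_page(conn, after_id=first[-1][0]))
```

Because the id is part of both the comparison and the sort, rows sharing a last name are neither repeated nor skipped between pages. With this ordering the three White rows come back as ids 2, 3, 6 rather than in the arbitrary order shown in the question.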
LIMIT and OFFSET are used for pagination, e.g.:
SELECT id, lastname, firstname FROM people
ORDER BY lastname, firstname, id
OFFSET 0
LIMIT 10
This returns rows 1 through 10. To retrieve the next page, set the offset to 10.
Here is the documentation:
https://www.postgresql.org/docs/9.6/static/queries-limit.html