Assign unique IDs to three tables in SELECT query, IDs should not overlap - sql

I am working on SQL Server and I want to assign unique IDs to rows being pulled from three tables, but the IDs should not overlap.
Let's say table one contains car data, table two contains house data, and table three contains city data. I want to pull all this data into a single table with a unique ID for each row, say cars from 1-100, houses from 101-200 and cities from 300-400.
How can I achieve this using only SELECT queries? I can't use INSERT statements.
To be more precise,
I have one table with computer systems/servers host information which has id from 500-700.
I also have two other tables, storage devices (IDs from 200-600) and routers (IDs from 700-900). I have already collected the systems data. Now I want to pull the storage systems and routers data in such a way that the consolidated data at my end has a unique ID for every record. This needs to be done using only SELECT queries.
I was using SELECT ABS(CAST(CAST(NEWID() AS VARBINARY) AS INT)) AS UniqueID and storing it in temp tables (separate for storage and routers). But I believe that this may lead to some overlapping. Please suggest any other way to do this.
An extension to this question:
Creating consistent integer from a string:
All I have is various strings like this
String1
String2Hello123
String3HelloHowAreYou
I need to convert them into positive integers, something like
String1 = 12
String2Hello123 = 25
String3HelloHowAreYou = 4567
Note that I am not expecting the numbers in any order. The only requirement is that the number generated for one string should not conflict with the number for another.
Now, later, after a reboot, suppose I no longer have the 2nd string and there is a new string instead:
String1 = 12
String3HelloHowAreYou = 4567
String2Hello123HowAreyou = 28
Note that the number 25 generated for the 2nd string earlier cannot be reused for the new string.
Using extra storage (temp tables) is not allowed.

If you don't care where the data comes from:
with dat as (
    select 't1' as src, id from table1
    union all
    select 't2' as src, id from table2
    union all
    select 't3' as src, id from table3
)
select *
    -- any deterministic ordering works; id2 is unique across all three tables
    , id2 = row_number() over (order by src, id)
from dat
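To see the idea end to end, here is a minimal sketch that runs the same UNION ALL plus ROW_NUMBER() pattern from Python. It uses SQLite (3.25 or newer, for window functions) purely as a stand-in for SQL Server, and the table names, columns and sample rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical source tables whose id ranges overlap, as in the question.
conn.executescript("""
    CREATE TABLE systems (id INTEGER, name TEXT);
    CREATE TABLE storage (id INTEGER, name TEXT);
    CREATE TABLE routers (id INTEGER, name TEXT);
    INSERT INTO systems VALUES (500, 'host-a'), (501, 'host-b');
    INSERT INTO storage VALUES (500, 'san-a'), (200, 'san-b');
    INSERT INTO routers VALUES (700, 'rtr-a'), (501, 'rtr-b');
""")

rows = conn.execute("""
    WITH dat AS (
        SELECT 'systems' AS src, id, name FROM systems
        UNION ALL
        SELECT 'storage' AS src, id, name FROM storage
        UNION ALL
        SELECT 'routers' AS src, id, name FROM routers
    )
    -- Number the combined rows; (src, id) gives a deterministic order.
    SELECT ROW_NUMBER() OVER (ORDER BY src, id) AS unique_id, src, id, name
    FROM dat
    ORDER BY unique_id
""").fetchall()

unique_ids = [r[0] for r in rows]
for row in rows:
    print(row)
```

The original id and the src label are kept next to the new number, so each row can still be traced back to its table. Note that the numbering is only stable as long as the underlying rows do not change between runs.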

Related

Bigquery join 2 tables with id concated from 4 columns and create a new table dynamically

I have two tables in BigQuery from two different data sources, let's say x and y. I want to join these two tables on the os_name, tracker_name, date and country columns. For that I am using the concat function and joining like this:
full outer join x on concat(x.date,x.os_name,x.tracker_name, x.country) = concat(y.date,y.os_name,y.tracker_name,y.country_code)
In the query result, the common columns also get duplicated: there are os_name and os_name_1, country_code and country_code_1, etc. I don't want that. The final columns should be as in the Final Table Schema below.
I want to return all records from both sides. For example, if there is no match in table y, y_install and y_purchase will be 0, and vice versa.
X TABLE SCHEMA:
os_name,
tracker_name,
date,
country,
install,
purchase
Y TABLE SCHEMA:
os_name,
tracker_name,
date,
country,
y_install,
y_purchase
Final Table Schema required:
os_name,
tracker_name,
date,
country,
install,
purchase,
y_install,
y_purchase
I am going to schedule the query and write results to destination table at given interval.
Can you help me out with this query?
Regarding the final table, I don't understand whether you want to return the first non-NULL result, or whether you want, e.g., an array containing the results from both tables when both tables have a valid value. In my sample table, do you want row 1 or 2 (actually the same thing) or row 3?
row_number | x_install | y_install | final_table_install
1 | 23 | 50 | 23
2 | NULL | 50 | 50
3 | 23 | 50 | [23,50]
It turns out that what I wanted was UNION ALL. First, I added the non-common columns to the two tables so that their schemas are identical. Then I was able to merge the tables vertically using UNION ALL. Thanks for trying to help out anyway.
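For anyone landing here later, a rough sketch of that approach: pad each side with zeros for the columns it lacks, stack the two with UNION ALL, and (optionally, this step is my own addition and not something the asker described) collapse to one row per key with GROUP BY. It runs against SQLite from Python as a stand-in for BigQuery, and the sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE x (os_name TEXT, tracker_name TEXT, date TEXT, country TEXT,
                    install INTEGER, purchase INTEGER);
    CREATE TABLE y (os_name TEXT, tracker_name TEXT, date TEXT, country TEXT,
                    y_install INTEGER, y_purchase INTEGER);
    INSERT INTO x VALUES ('ios', 'organic', '2020-01-01', 'US', 23, 2),
                         ('android', 'organic', '2020-01-01', 'DE', 7, 1);
    INSERT INTO y VALUES ('ios', 'organic', '2020-01-01', 'US', 50, 5),
                         ('ios', 'paid', '2020-01-01', 'FR', 4, 0);
""")

merged = conn.execute("""
    SELECT os_name, tracker_name, date, country,
           SUM(install)   AS install,
           SUM(purchase)  AS purchase,
           SUM(y_install) AS y_install,
           SUM(y_purchase) AS y_purchase
    FROM (
        -- Pad x with zeros for the y-only columns...
        SELECT os_name, tracker_name, date, country,
               install, purchase, 0 AS y_install, 0 AS y_purchase
        FROM x
        UNION ALL
        -- ...and pad y with zeros for the x-only columns.
        SELECT os_name, tracker_name, date, country,
               0 AS install, 0 AS purchase, y_install, y_purchase
        FROM y
    ) AS stacked
    GROUP BY os_name, tracker_name, date, country
    ORDER BY os_name, tracker_name, date, country
""").fetchall()

for row in merged:
    print(row)
```

Keys that exist on only one side come out with 0 for the other side's metrics, which matches the "return all records from both sides" requirement without duplicated key columns.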

How to aggregate data stored column-wise in a matrix table

I have the following table. Ellipses (...) represent multiple columns of a similar type.
TABLE: diagnosis_info
COLUMNS: visit_id,
patient_diagnosis_code_1 ...
patient_diagnosis_code_100 -- char(100) with a value of ‘0’ or ‘1’
How do I find the most common diagnosis_code? There are 101 columns including the visit_id. The table is like a matrix table of 0s and 1s. How do I write something that can dynamically account for all the columns and count all the rows where the value is 1?
What I would normally do is not feasible as there are too many columns:
SELECT COUNT(patient_diagnosis_code_1), COUNT(patient_diagnosis_code_2),... FROM diagnosis_info WHERE patient_diagnosis_code_1 = '1' and patient_diagnosis_code_2 = '1' and ….
Then, even if I typed all that out, how would I select which column had the highest count of values = 1? The table is column-oriented instead of row-oriented.
Unfortunately, your data design is bad from the start. It could be as simple as:
patient_id, visit_id, diagnosis_code
where a patient with 1 diagnostic code would have 1 row and a patient with 100 diagnostic codes would have 100 rows. At any given time you could transpose this into the format you presented (which is called a pivot or cross tab). Also, in some databases, for example PostgreSQL, you could put all those diagnostic codes into an array field, so it would look like:
patient_id, visit_id, diagnosis_code (data type -bool or int- array)
What you need now is the reverse, which is called unpivot. Some databases, such as SQL Server, have an UNPIVOT operator for this.
Without knowing what your backend is, you could do that with an ugly SQL like:
select code, pdc
from
(
    -- one counting branch per diagnosis column; the columns hold '0' or '1'
    select 1 as code, count(*) as pdc
    from myTable where patient_diagnosis_code_1 = '1'
    union all
    select 2 as code, count(*) as pdc
    from myTable where patient_diagnosis_code_2 = '1'
    union all
    ...
    select 100 as code, count(*) as pdc
    from myTable where patient_diagnosis_code_100 = '1'
) tmp
order by pdc desc, code;
PS: This would return all the codes with their frequency, ordered from most to least frequent. You could limit the result to 1 row to get the maximum (or use a "with ties" variant in case more than one code matches the maximum).
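Since nobody wants to type 100 branches by hand, the query text can be generated. A minimal sketch in Python, assuming the column naming from the question (patient_diagnosis_code_1 ... patient_diagnosis_code_N); it uses SQLite and only 3 columns of invented data so it stays small and runnable.

```python
import sqlite3

NUM_CODES = 3  # the table in the question has 100 such columns

conn = sqlite3.connect(":memory:")
cols = ", ".join(
    f"patient_diagnosis_code_{i} CHAR(1)" for i in range(1, NUM_CODES + 1)
)
conn.execute(f"CREATE TABLE diagnosis_info (visit_id INTEGER, {cols})")

# Invented sample rows: visit_id followed by one '0'/'1' flag per code.
placeholders = ", ".join("?" * (NUM_CODES + 1))
conn.executemany(
    f"INSERT INTO diagnosis_info VALUES ({placeholders})",
    [(1, "1", "0", "1"), (2, "0", "0", "1"), (3, "1", "0", "1")],
)

# Build one counting branch per column instead of writing them out manually.
branches = [
    f"SELECT {i} AS code, COUNT(*) AS pdc FROM diagnosis_info "
    f"WHERE patient_diagnosis_code_{i} = '1'"
    for i in range(1, NUM_CODES + 1)
]
query = " UNION ALL ".join(branches) + " ORDER BY pdc DESC, code"

counts = conn.execute(query).fetchall()
most_common = counts[0]  # (code, number of visits with that code set)
print(counts)
print(most_common)
```

The column names are built from integers under the script's control, so plain string formatting is acceptable here; do not build SQL this way from user input.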

Oracle 11 SQL lists and two table comparison

I have a list of 12 strings (strings of numbers) that I need to compare to an existing table in Oracle. However I don't want to create a table just to do the compare; that takes more time than it is worth.
select
column_value as account_number
from
table(sys.odcivarchar2list('27001', '900480', '589358', '130740',
'807958', '579813', '1000100462', '656025',
'11046', '945287', '18193', '897603'))
This provides the correct result.
Now I want to compare that list of 12 against an actual table of account numbers to find the missing values. Normally, with two tables, I would do a left join on table1.account_number = table2.account_number, and the table2 columns would be blank for the unmatched rows. When I attempt that using the above, all I get are the rows where the two records are equal.
select column_value as account_number, k.acct_num
from table(sys.odcivarchar2list('27001', '900480', '589358', '130740',
                                '807958', '579813', '1000100462', '656025',
                                '11046', '945287', '18193', '897603'))
left join isi_owner.t_known_acct k on column_value = k.acct_num
9 rows match, but the other 3 should still appear with their value from the list and a blank for k.acct_num.
Thoughts?
Sean

Random sample table with Hive, but including matching rows

I have a large table containing a userID column and other user variable columns, and I would like to use Hive to extract a random sample of users based on their userID. Furthermore, sometimes these users will be on multiple rows and if a randomly selected userID is contained in other parts of the table I would like to extract those rows too.
I had a look at the Hive sampling documentation and I see that something like this can be done to extract a 1% sample:
SELECT * FROM source
TABLESAMPLE (1 PERCENT) s;
but I am not sure how to add the constraint where I would like all other instances of those 1% userIDs selected too.
You can use rand() to split the users randomly, with the proportion you want in each category. I recommend rand() because setting the seed to a fixed value makes the results repeatable.
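select c.*
from
    (select userID
          -- 0.1 keeps about 10% of users; use 0.01 for about 1%
          , if(rand(5555) < 0.1, 'test', 'train') as type
     from
        (select userID
         from mytable
         group by userID -- one row per user, so each user gets a single draw
        ) a
    ) b
right outer join
    (select *
     from mytable
    ) c
on b.userid = c.userid
where type = 'test'
;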
This is set up for entity level modeling purposes, which is why I have test and train as types.
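The same two-step idea (draw once per distinct user, then pull every row for the chosen users) can be sketched outside Hive. The version below is only an illustration in Python with SQLite: the draw is done with Python's seeded random.Random rather than Hive's rand(seed), the table contents are invented, and the helper table sampled_users is a name I made up.

```python
import random
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (userID INTEGER, event TEXT)")
# Invented data: 20 users with 3 rows each.
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?)",
    [(u, f"event-{n}") for u in range(1, 21) for n in range(3)],
)

rng = random.Random(5555)  # fixed seed so the split is repeatable

# Step 1: one draw per distinct user, in a fixed order.
user_ids = [
    r[0]
    for r in conn.execute("SELECT DISTINCT userID FROM mytable ORDER BY userID")
]
sampled = [(u,) for u in user_ids if rng.random() < 0.1]

# Step 2: pull every row belonging to a sampled user.
conn.execute("CREATE TEMP TABLE sampled_users (userID INTEGER PRIMARY KEY)")
conn.executemany("INSERT INTO sampled_users VALUES (?)", sampled)
rows = conn.execute("""
    SELECT m.*
    FROM mytable AS m
    JOIN sampled_users AS s ON s.userID = m.userID
    ORDER BY m.userID, m.event
""").fetchall()

print(len(sampled), "users sampled,", len(rows), "rows returned")
```

Because the draw happens per user and not per row, a sampled user always comes back with all of their rows, which is the constraint TABLESAMPLE alone does not give you.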

How can I retrieve similar data from two separate tables simultaneously?

Disclaimer: my SQL skills are basic, to say the least.
Let's say I have two similar data types in different tables of the same database.
The first table is called hardback and the fields are as follows:
hbID | hbTitle | hbPublisherID | hbPublishDate
The second table is called paperback and its fields hold similar data but the fields are named differently:
pbID | pbTitle | pbPublisherID | pbPublishDate
I need to retrieve the 10 most recent hardback and paperback books, where the publisher ID is 7.
This is what I have so far:
SELECT TOP 10
hbID, hbTitle, hbPublisherID, hbPublishDate AS pDate
bpID, pbTitle, bpPublisherID, pbPublishDate AS pDate
FROM hardback CROSS JOIN paperback
WHERE (hbPublisherID = 7) OR (pbPublisherID = 7)
ORDER BY pDate DESC
This returns seven columns per row, at least three of which may or may not be for the wrong publisher. Possibly four, depending on the contents of pDate, which is almost certainly going to be a problem if the other six columns are for the correct publisher!
In an effort to release an earlier version of this software, I ran two separate queries fetching 10 records each, then sorted them by date and discarded the bottom ten, but I just know there must be a more elegant way to do it!
Any suggestions?
Aside: I was reviewing what I'd written here, when my Mac suddenly experienced a kernel panic. Restarted, reopened my tabs and everything I'd typed was still here! Stack Exchange sites are awesome :)
The easiest way is probably a UNION:
SELECT TOP 10 * FROM
(SELECT hbID, hbTitle, hbPublisherID as PublisherID, hbPublishDate as pDate
FROM hardback
UNION
SELECT pbID, pbTitle, pbPublisherID, pbPublishDate
FROM paperback
) books
WHERE PublisherID = 7
ORDER BY pDate DESC
If you could have two copies of the same title (1 paperback, 1 hardcover), change the UNION to a UNION ALL; UNION alone discards duplicates. You could also add a column that indicates what book type it is by adding a pseudo-column to each select (after the publish date, for instance):
hbPublishDate as pDate, 'H' as Covertype
You'll have to add the same new column to the paperback half of the query, using 'P' instead. Note that in the second query you don't have to specify column names; the result set takes the names from the first one. The column data types in the two queries also have to match: you can't UNION a date column in the first with a numeric column in the second without converting the two columns to the same data type in the query.
Here's a sample script for creating two tables and doing a select like the one above. It works just fine in SQL Server Management Studio. Just remember to drop the two tables (using DROP TABLE tablename) when you're done.
use tempdb;
create table Paperback (pbID Integer Identity,
pbTitle nvarchar(30), pbPublisherID Integer, pbPubDate Date);
create table Hardback (hbID Integer Identity,
hbTitle nvarchar(30), hbPublisherID Integer, hbPubDate Date);
insert into Paperback (pbTitle, pbPublisherID, pbPubDate)
values ('Test title 1', 1, GETDATE());
insert into Hardback (hbTitle, hbPublisherID, hbPubDate)
values ('Test title 1', 1, GETDATE());
select * from (
select pbID, pbTitle, pbPublisherID, pbPubDate, 'P' as Covertype
from Paperback
union all
select hbID, hbTitle, hbPublisherID, hbPubDate,'H'
from Hardback) books
order by CoverType;
/* You'd drop the two tables here with
DROP table Paperback;
DROP table HardBack;
*/
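If you want to check the "10 most recent for publisher 7" behaviour without a SQL Server instance, here is a small sketch that runs the same shape of query against SQLite from Python. SQLite has no TOP, so LIMIT 10 stands in for it; the sample titles and dates are invented, and the column names follow the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE hardback (hbID INTEGER PRIMARY KEY, hbTitle TEXT,
                           hbPublisherID INTEGER, hbPublishDate TEXT);
    CREATE TABLE paperback (pbID INTEGER PRIMARY KEY, pbTitle TEXT,
                            pbPublisherID INTEGER, pbPublishDate TEXT);
""")
# 8 hardbacks (January) and 8 paperbacks (February) for publisher 7,
# plus one newer hardback from publisher 3 that must be filtered out.
conn.executemany(
    "INSERT INTO hardback (hbTitle, hbPublisherID, hbPublishDate) VALUES (?, ?, ?)",
    [(f"HB {d}", 7, f"2020-01-{d:02d}") for d in range(1, 9)]
    + [("Other publisher", 3, "2021-01-01")],
)
conn.executemany(
    "INSERT INTO paperback (pbTitle, pbPublisherID, pbPublishDate) VALUES (?, ?, ?)",
    [(f"PB {d}", 7, f"2020-02-{d:02d}") for d in range(1, 9)],
)

rows = conn.execute("""
    SELECT * FROM (
        SELECT hbID AS ID, hbTitle AS Title, hbPublisherID AS PublisherID,
               hbPublishDate AS pDate, 'H' AS Covertype
        FROM hardback
        UNION ALL
        SELECT pbID, pbTitle, pbPublisherID, pbPublishDate, 'P'
        FROM paperback
    ) AS books
    WHERE PublisherID = 7
    ORDER BY pDate DESC
    LIMIT 10
""").fetchall()

for row in rows:
    print(row)
```

The ORDER BY on the combined result is what makes "top 10" mean "10 most recent across both tables"; without it the database is free to return any 10 matching rows.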
I think it would clearly be better to have only one table, with a reference to another table that holds the category of each entry (hardback or paperback). That is my first suggestion.
By the way, what is your programming language?