How can I retrieve similar data from two separate tables simultaneously? - sql-server-2000

Disclaimer: my SQL skills are basic, to say the least.
Let's say I have two similar data types in different tables of the same database.
The first table is called hardback and the fields are as follows:
hbID | hbTitle | hbPublisherID | hbPublishDate
The second table is called paperback and its fields hold similar data but the fields are named differently:
pbID | pbTitle | pbPublisherID | pbPublishDate
I need to retrieve the 10 most recent hardback and paperback books, where the publisher ID is 7.
This is what I have so far:
SELECT TOP 10
hbID, hbTitle, hbPublisherID, hbPublishDate AS pDate,
pbID, pbTitle, pbPublisherID, pbPublishDate AS pDate
FROM hardback CROSS JOIN paperback
WHERE (hbPublisherID = 7) OR (pbPublisherID = 7)
ORDER BY pDate DESC
This returns seven columns per row, at least three of which may or may not be for the wrong publisher. Possibly four, depending on the contents of pDate, which is almost certainly going to be a problem if the other six columns are for the correct publisher!
In an effort to release an earlier version of this software, I ran two separate queries fetching 10 records each, then sorted them by date and discarded the bottom ten, but I just know there must be a more elegant way to do it!
Any suggestions?
Aside: I was reviewing what I'd written here, when my Mac suddenly experienced a kernel panic. Restarted, reopened my tabs and everything I'd typed was still here! Stack Exchange sites are awesome :)

The easiest way is probably a UNION:
SELECT TOP 10 * FROM
(SELECT hbID, hbTitle, hbPublisherID as PublisherID, hbPublishDate as pDate
FROM hardback
UNION
SELECT pbID, pbTitle, pbPublisherID, pbPublishDate
FROM paperback
) books
WHERE PublisherID = 7
ORDER BY pDate DESC
If you could have two copies of the same title (1 paperback, 1 hardcover), change the UNION to a UNION ALL; UNION alone discards duplicates. You could also add a column that indicates what book type it is by adding a pseudo-column to each select (after the publish date, for instance):
hbPublishDate as pDate, 'H' as Covertype
You'll have to add the same new column to the paperback half of the query, using 'P' instead. Note that in the second query you don't have to specify column aliases; the result set takes its column names from the first one. The data types of corresponding columns in the two queries also have to match: you can't UNION a date column in the first with a numeric column in the second without converting both to the same data type in the query.
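Here's a sample script for creating two tables and doing a select like the one above. It works just fine in SQL Server Management Studio. (It uses the DATE type, which requires SQL Server 2008 or later; on older versions, use DATETIME instead.) Just remember to drop the two tables (using DROP TABLE tablename) when you're done.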
use tempdb;
create table Paperback (pbID Integer Identity,
pbTitle nvarchar(30), pbPublisherID Integer, pbPubDate Date);
create table Hardback (hbID Integer Identity,
hbTitle nvarchar(30), hbPublisherID Integer, hbPubDate Date);
insert into Paperback (pbTitle, pbPublisherID, pbPubDate)
values ('Test title 1', 1, GETDATE());
insert into Hardback (hbTitle, hbPublisherID, hbPubDate)
values ('Test title 1', 1, GETDATE());
select * from (
select pbID, pbTitle, pbPublisherID, pbPubDate, 'P' as Covertype
from Paperback
union all
select hbID, hbTitle, hbPublisherID, hbPubDate,'H'
from Hardback) books
order by CoverType;
/* You'd drop the two tables here with
DROP table Paperback;
DROP table HardBack;
*/
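Putting the pieces together for the original requirement (the 10 most recent books from publisher 7 across both tables), a minimal sketch might look like the one below. It assumes the column names from the question; the temp tables #hardback and #paperback and their rows are made up for illustration, and it sticks to DATETIME and single-row INSERTs on the assumption that the target really is SQL Server 2000.

```sql
-- Hypothetical sample tables; column names follow the question.
CREATE TABLE #hardback (hbID INT IDENTITY, hbTitle NVARCHAR(30),
                        hbPublisherID INT, hbPublishDate DATETIME);
CREATE TABLE #paperback (pbID INT IDENTITY, pbTitle NVARCHAR(30),
                         pbPublisherID INT, pbPublishDate DATETIME);

-- Made-up rows, one INSERT per row.
INSERT INTO #hardback (hbTitle, hbPublisherID, hbPublishDate)
VALUES (N'Hardback A', 7, '20090105');
INSERT INTO #hardback (hbTitle, hbPublisherID, hbPublishDate)
VALUES (N'Hardback B', 3, '20090210');
INSERT INTO #paperback (pbTitle, pbPublisherID, pbPublishDate)
VALUES (N'Paperback A', 7, '20090301');
INSERT INTO #paperback (pbTitle, pbPublisherID, pbPublishDate)
VALUES (N'Paperback B', 7, '20081120');

-- Filter each half by publisher, tag it with a cover type,
-- then take the 10 newest rows of the combined set.
SELECT TOP 10 ID, Title, PublisherID, pDate, CoverType
FROM (
    SELECT hbID AS ID, hbTitle AS Title, hbPublisherID AS PublisherID,
           hbPublishDate AS pDate, 'H' AS CoverType
    FROM #hardback
    WHERE hbPublisherID = 7
    UNION ALL
    SELECT pbID, pbTitle, pbPublisherID, pbPublishDate, 'P'
    FROM #paperback
    WHERE pbPublisherID = 7
) AS books
ORDER BY pDate DESC;

DROP TABLE #hardback;
DROP TABLE #paperback;
```

With these sample rows the query returns Paperback A, Hardback A and Paperback B, in that order; Hardback B is left out because it belongs to publisher 3.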

I think it is clearly better to have only one table, with a reference to another table that holds the category of each entry (hardback or paperback). That is my first suggestion.
By the way, what is your programming language?

Related

How to design a SQL table where a field has many descriptions

I would like to create a product table. Each product has a unique part number. However, each part number has a varying number of previous part numbers, and a varying number of machines in which the part can be used.
For example the description for part no: AA1007
Previous part no's: AA1001, AA1002, AA1004, AA1005,...
Machine brand: Bosch, Indesit, Samsung, HotPoint, Sharp,...
Machine Brand Models: Bosch A1, Bosch A2, Bosch A3, Indesit A1, Indesit A2,....
I would like to create a table for this, but I am not sure how to proceed. All I have been able to think of is to create individual tables for Previous Part no, Machine Brand and Machine Brand Models.
Question: what is the proper way to design these tables?
There are of course various ways to design the tables. A very basic way would be to create tables like the ones below. I added the columns ValidFrom and ValidTill to identify the period during which a part was active/in use.
Whether the date datatype is enough, or you need datetime for more precision, depends on your data.
CREATE TABLE Parts
(
ID bigint NOT NULL
,PartNo varchar(100)
,PartName varchar(100)
,ValidFrom date
,ValidTill date
)
CREATE TABLE Brands
(
ID bigint NOT NULL
,Brand varchar(100)
)
CREATE TABLE Models
(
ID bigint NOT NULL
,BrandsID bigint NOT NULL
,ModelName varchar(100)
)
CREATE TABLE ModelParts
(
ModelsID bigint NOT NULL
,PartID bigint NOT NULL
)
Fill your data like:
INSERT INTO Parts VALUES
(1,'AA1007', 'Screw HyperFuturistic', '2017-08-09', '9999-12-31'),
(1,'AA1001', 'Screw Iron', '1800-01-01', '1918-06-30'),
(1,'AA1002', 'Screw Steel', '1918-07-01', '1945-05-08'),
(1,'AA1004', 'Screw Titanium', '1945-05-09', '1983-10-05'),
(1,'AA1005', 'Screw Futurium', '1983-10-06', '2017-08-08')
INSERT INTO Brands VALUES
(1,'Bosch'),
(2,'Indesit'),
(3,'Samsung'),
(4,'HotPoint'),
(5,'Sharp')
INSERT INTO Models VALUES
(1,1,'A1'),
(2,1,'A2'),
(3,1,'A3'),
(4,2,'A1'),
(5,2,'A2')
INSERT INTO ModelParts VALUES
(1,1)
To select all parts of a certain date (in this case 2013-03-03) of the "Bosch A1":
DECLARE @ReportingDate date = '2013-03-03'
SELECT B.Brand
,M.ModelName
,P.PartNo
,P.PartName
,P.ValidFrom
,P.ValidTill
FROM Brands B
INNER JOIN Models M
ON M.BrandsID = B.ID
INNER JOIN ModelParts MP
ON MP.ModelsID = M.ID
INNER JOIN Parts P
ON P.ID = MP.PartID
WHERE B.Brand = 'Bosch'
AND M.ModelName = 'A1'
AND P.ValidFrom <= @ReportingDate
AND P.ValidTill >= @ReportingDate
Of course there are several ways to keep a history of data.
ValidFrom and ValidTill (ValidTo) is one of my favourites, as you can easily do historical reports.
Unfortunately you have to handle the historization yourself: when inserting a new row (for example for your screw) you have to "close" the old record by setting its ValidTill column before inserting the new one. Furthermore you have to develop logic to handle deletes...
Well, that's quite a large topic. You will find tons of information on the web.
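To make the "close the old record, then insert the new one" step concrete, here is a rough sketch. It works on a temp copy (#Parts) of the Parts table from above; the change date, the part number 'AA1008' and the name 'Screw Next' are invented for the example, and the one-day gap logic assumes date (not datetime) granularity.

```sql
-- Hypothetical copy of the Parts table, holding only the currently open row.
CREATE TABLE #Parts (ID bigint NOT NULL, PartNo varchar(100), PartName varchar(100),
                     ValidFrom date, ValidTill date);
INSERT INTO #Parts (ID, PartNo, PartName, ValidFrom, ValidTill)
VALUES (1, 'AA1007', 'Screw HyperFuturistic', '2017-08-09', '9999-12-31');

-- Made-up date on which the new part number takes over.
DECLARE @ChangeDate date = '2018-01-01';

BEGIN TRANSACTION;

-- 1. Close the open record so it ends the day before the change.
UPDATE #Parts
SET ValidTill = DATEADD(DAY, -1, @ChangeDate)
WHERE ID = 1
  AND ValidTill = '9999-12-31';

-- 2. Insert the successor as the new open record.
INSERT INTO #Parts (ID, PartNo, PartName, ValidFrom, ValidTill)
VALUES (1, 'AA1008', 'Screw Next', @ChangeDate, '9999-12-31');

COMMIT TRANSACTION;

SELECT PartNo, ValidFrom, ValidTill
FROM #Parts
ORDER BY ValidFrom;

DROP TABLE #Parts;
```

Doing both statements in one transaction keeps readers from ever seeing a state with no open record or with two of them.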
For the part number table, you can consider the following suggestion:
id | part_no | time_created
1 | AA1007 | 2017-08-08
1 | AA1001 | 2017-07-01
1 | AA1002 | 2017-06-10
1 | AA1004 | 2017-03-15
1 | AA1005 | 2017-01-30
In other words, you can add a datetime column which versions each part number. Note that I added an id column here, which is invariant over time and keeps track of each part even though its part number may change.
For time-independent queries, you would join this table using the id column. However, the part number might also serve as a foreign key. Off the top of my head, if you were generating an invoice for a previous date, you might look up the part number that applied at that time, and then join out to one or more tables using that part number.
For the other tables you mentioned, I do not see a similar requirement.
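The invoice lookup mentioned above could be sketched roughly as follows. The table variable @PartNumbers simply mirrors the sample rows from this answer, and the invoice date is made up.

```sql
-- Sample rows copied from the table above.
DECLARE @PartNumbers TABLE (id int, part_no varchar(20), time_created date);
INSERT INTO @PartNumbers (id, part_no, time_created) VALUES
    (1, 'AA1007', '2017-08-08'),
    (1, 'AA1001', '2017-07-01'),
    (1, 'AA1002', '2017-06-10'),
    (1, 'AA1004', '2017-03-15'),
    (1, 'AA1005', '2017-01-30');

-- Hypothetical invoice date.
DECLARE @InvoiceDate date = '2017-07-15';

-- The part number in effect on @InvoiceDate is the most recent one
-- created on or before that date.
SELECT TOP (1) part_no, time_created
FROM @PartNumbers
WHERE id = 1
  AND time_created <= @InvoiceDate
ORDER BY time_created DESC;
```

For the sample data this returns AA1001, the number created on 2017-07-01.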

SQL Server where condition on column with separated values

I have a table with a column that can have values separated by ",".
Example column group:
id column group:
1 10,20,30
2 280
3 20
I want to create a SELECT with a WHERE condition on the group column, where I can search for, say, 20 and it should return rows 1 and 3, or search for 20,280 and it should return rows 1 and 2.
Can you help me please?
As pointed out in the comments, storing multiple values in a single column is not a good idea.
Coming to your question: you can use a split-string function (such as the dbo.SplitStrings_Numbers used below, which is user-defined and must already exist in your database) to split the comma-separated values into a table and then query them:
create table #temp
(
id int,
columnss varchar(100)
)
insert into #temp
values
(1,'10,20,30'),
(2, '280'),
(3, '20')
select *
from #temp
cross apply
(
select * from dbo.SplitStrings_Numbers(columnss,',')
)b
where item in (20)
id columnss Item
1 10,20,30 20
3 20 20
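If you would rather not install a user-defined splitter, the same idea can be sketched with the built-in STRING_SPLIT function. This assumes SQL Server 2016 or later with database compatibility level 130 or higher; the table variable @groups just mirrors the sample data.

```sql
-- Sample data from the question.
DECLARE @groups TABLE (id int, columnss varchar(100));
INSERT INTO @groups (id, columnss) VALUES
    (1, '10,20,30'),
    (2, '280'),
    (3, '20');

-- STRING_SPLIT returns one row per element in a column named "value".
SELECT g.id, g.columnss, s.value AS Item
FROM @groups AS g
CROSS APPLY STRING_SPLIT(g.columnss, ',') AS s
WHERE s.value IN ('20', '280')
ORDER BY g.id;
```

Searching for both 20 and 280 this way matches all three sample rows, because row 3 also contains 20.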
The short answer is: don't do it.
Instead normalize your tables to at least 3NF. If you don't know what database normalization is, you need to do some reading.
If you absolutely have to do it (e.g. this is a legacy system and you cannot change the table structure), there are several articles on string splitting with TSQL and at least a couple that have done extensive benchmarks on various methods available (e.g. see: http://sqlperformance.com/2012/07/t-sql-queries/split-strings)
Since you only want to search, you don't really need to split the strings, so you can write something like:
SELECT id, list
FROM t
WHERE ','+list+',' LIKE '%,'+@searchValue+',%'
Where t(id int, list varchar(max)) is the table to search and @searchValue is the value you are looking for. If you need to search for more than one value you have to add those in a table and use a join or subquery.
E.g. if s(searchValue varchar(max)) is the table of values to search then:
SELECT distinct t.id, t.list
FROM t INNER JOIN s
ON ','+t.list+',' LIKE '%,'+s.searchValue+',%'
If you need to pass those search values from ADO.NET, consider table-valued parameters.
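Here is a small self-contained version of the multi-value search you can paste into a query window. The table variables @t and @s stand in for t and s, filled with the sample data from the question.

```sql
-- @t plays the role of t(id, list); @s plays the role of s(searchValue).
DECLARE @t TABLE (id int, list varchar(100));
DECLARE @s TABLE (searchValue varchar(20));

INSERT INTO @t (id, list) VALUES (1, '10,20,30'), (2, '280'), (3, '20');
INSERT INTO @s (searchValue) VALUES ('20'), ('280');

-- Wrapping both sides in commas makes '20' match only a whole element,
-- so it does not match inside '280'.
SELECT DISTINCT t.id, t.list
FROM @t AS t
INNER JOIN @s AS s
    ON ',' + t.list + ',' LIKE '%,' + s.searchValue + ',%'
ORDER BY t.id;
```

This returns ids 1, 2 and 3: row 1 and row 3 contain 20, and row 2 contains 280. The DISTINCT is there so that a row matching several search values is listed only once.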

Assign unique ID's to three tables in SELECT query, ID's should not overlap

I am working on SQL Server and I want to assign unique IDs to rows being pulled from three tables, but the IDs should not overlap.
Let's say table one contains cars data, table two contains house data, and table three contains city data. I want to pull all this data into a single table with a unique ID for each row, say cars from 1-100, houses from 101-200 and cities from 300-400.
How can I achieve this using only SELECT queries? I can't use INSERT statements.
To be more precise,
I have one table with computer systems/servers host information which has id from 500-700.
I have other tables, storage devices (IDs from 200-600) and routers (IDs from 700-900). I have already collected the systems data. Now I want to pull the storage systems and routers data in such a way that the consolidated data at my end has a unique ID for all records. This needs to be done using only SELECT queries.
I was using SELECT ABS(CAST(CAST(NEWID() AS VARBINARY) AS INT)) AS UniqueID and storing it in temp tables (separate for storage and routers). But I believe that this may lead to some overlapping. Please suggest any other way to do this.
An extension to this question:
Creating consistent integer from a string:
All I have is various strings like this
String1
String2Hello123
String3HelloHowAreYou
I need to convert them into positive integers, something like
String1 = 12
String2Hello123 = 25
String3HelloHowAreYou = 4567
Note that I am not expecting the numbers in any order. The only requirement is that the number generated for one string should not conflict with another.
Now, later, after a reboot, suppose I no longer have the 2nd string and there is a new string instead:
String1 = 12
String3HelloHowAreYou = 4567
String2Hello123HowAreyou = 28
Note that the number 25 generated for the 2nd string earlier cannot be reused for the new string.
Using extra storage (temp tables) is not allowed
If you don't care where the data comes from:
with dat as (
select 't1' src, id from table1
union all
select 't2' src, id from table2
union all
select 't3' src, id from table3
)
select *
, id2 = row_number() over( order by _some_column_ )
from dat
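As a runnable illustration of the query above, here is a sketch with three table variables standing in for the systems, storage and routers tables. The names and rows are invented, and ordering by (src, id) is just one possible choice for _some_column_.

```sql
-- Hypothetical source tables whose id ranges overlap.
DECLARE @systems TABLE (id int, name varchar(50));
DECLARE @storage TABLE (id int, name varchar(50));
DECLARE @routers TABLE (id int, name varchar(50));

INSERT INTO @systems (id, name) VALUES (500, 'host-a'), (501, 'host-b');
INSERT INTO @storage (id, name) VALUES (500, 'array-a'), (200, 'array-b');
INSERT INTO @routers (id, name) VALUES (700, 'router-a');

-- Tag each row with its source, then number the combined set.
WITH dat AS (
    SELECT 'systems' AS src, id, name FROM @systems
    UNION ALL
    SELECT 'storage', id, name FROM @storage
    UNION ALL
    SELECT 'routers', id, name FROM @routers
)
SELECT ROW_NUMBER() OVER (ORDER BY src, id) AS UniqueID,
       src,
       id AS OriginalID,
       name
FROM dat
ORDER BY UniqueID;
```

The generated UniqueID values do not overlap within one result set, but they are not stable: adding or removing a source row renumbers everything after it. The (src, OriginalID) pair is the part that stays constant.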

CDC in sql server

I have enabled the CDC feature on one of my databases. Now I have the table data below in the CDC tables:
MemberID LastName __$operation
1 David 4
1 Dave 4
2 Jimmy 4
2 Test 4
Now my problem is that I have to query the CDC table and get the latest row for each member (the most recently updated value). For example, the query would return:
MemberID LastName __$operation
1 Dave 4
2 Test 4
In addition to the __$operation column, there are also the __$start_lsn and __$seqval columns. Ordering by those two should get you there.
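A sketch of that ordering idea: the table variable @changes stands in for the real change table (whose name depends on your capture instance), and the LSN and sequence values are made up so that the script runs without CDC being enabled.

```sql
-- Stand-in for a CDC change table; only the columns needed here.
DECLARE @changes TABLE ([__$start_lsn] binary(10), [__$seqval] binary(10),
                        [__$operation] int, MemberID int, LastName varchar(50));

INSERT INTO @changes VALUES
    (0x00000000000000000001, 0x00000000000000000001, 4, 1, 'David'),
    (0x00000000000000000002, 0x00000000000000000001, 4, 1, 'Dave'),
    (0x00000000000000000003, 0x00000000000000000001, 4, 2, 'Jimmy'),
    (0x00000000000000000003, 0x00000000000000000002, 4, 2, 'Test');

-- Number each member's changes from newest to oldest and keep the newest.
SELECT MemberID, LastName, [__$operation]
FROM (
    SELECT MemberID, LastName, [__$operation],
           ROW_NUMBER() OVER (PARTITION BY MemberID
                              ORDER BY [__$start_lsn] DESC, [__$seqval] DESC) AS rn
    FROM @changes
) AS latest
WHERE rn = 1
ORDER BY MemberID;
```

For the sample rows this returns Dave for member 1 and Test for member 2. The __$seqval tiebreaker matters when several changes to the same member share one transaction, as the two rows for member 2 do here.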
You cannot determine this from __$operation alone. If you want to do it correctly, use the other CDC columns as well:
__$start_lsn
__$end_lsn
__$seqval
__$update_mask
So I'm not 100% sure I understand what you are asking for, but if you need the latest values for all the members in the table, then ignore the CDC table and just query the table itself, as this is where all the latest values are after all.
If you need to see the latest values for all the members that have been changed within a certain time period, then you should use the cdc.fn_cdc_get_net_changes_(capture_instance) function, which is described in the SQL Server documentation under cdc.fn_cdc_get_net_changes.
This allows you to specify a start and end date for the capture period (via the sys.fn_cdc_map_time_to_lsn function which allows you to map the LSNs to actual times) and it will then output the net changes to the table within this period.
The name of the cdc.fn_cdc_get_net_changes_(capture_instance) function is generated from your capture instance, which by default is derived from the table's schema and name. As you have not specified yours, I have assumed dbo_members; please change it as required. Here is an example of how you can get a list of the latest values for all members changed within the last day, using the functions described above:
DECLARE @begin_time DATETIME ,
@end_time DATETIME ,
@begin_lsn BINARY(10) ,
@end_lsn BINARY(10);
SELECT @begin_time = GETDATE() - 1 ,
@end_time = GETDATE();
SELECT @begin_lsn = sys.fn_cdc_map_time_to_lsn('smallest greater than',
@begin_time);
SELECT @end_lsn = sys.fn_cdc_map_time_to_lsn('largest less than or equal',
@end_time);
SELECT [MemberID] ,
[LastName]
FROM cdc.fn_cdc_get_net_changes_dbo_members(@begin_lsn, @end_lsn, 'all')
GO
As per steoleary, you can simply check the data table for the latest values and ignore CDC altogether. But if you want to see what changed, with the values before and after, you will need the __$operation values 3 (the row as it was before an update) and 4 (the row as it is after the update), in conjunction with __$start_lsn. These before and after images correspond to the deleted and inserted tables you would use when writing triggers, by the way.
To just see which columns changed, as a precursor to actually evaluating those values, you can use the __$update_mask column together with the cdc.captured_columns table, which provides the actual column names: call sys.fn_cdc_is_bit_set(captured_columns.column_ordinal, __$update_mask) and keep the columns where the result = 1.
Welcome to the wacky world of CDC and the copious amounts of late nights and caffeine hits required to even attempt to master it!
If your cdc system table name is cdc.dbo_demo_ct then with following query you will get desired result:
SELECT *
FROM (SELECT Row_number() OVER (partition BY a.MemberID ORDER BY b.tran_end_time DESC) t,
*
FROM cdc.dbo_demo_ct a
INNER JOIN cdc.lsn_time_mapping b
ON a.__$start_lsn = b.start_lsn) T
WHERE T.t = 1

Normalizing a table, from one to the other

I'm trying to normalize a MySQL database...
I currently have a table that contains 11 columns for "categories". The first column is a user_id and the other 10 are category_id_1 - category_id_10. Some rows may only contain a category_id up to category_id_1 and the rest might be NULL.
I then have a table that has 2 columns, user_id and category_id...
What is the best way to transfer all of the data into separate rows in table 2 without adding a row for columns that are NULL in table 1?
thanks!
You can create a single query to do all the work; it just takes a bit of copying and pasting, and adjusting the column name each time:
INSERT INTO table2
SELECT * FROM (
SELECT user_id, category_id_1 AS category_id FROM table1
UNION ALL
SELECT user_id, category_id_2 FROM table1
UNION ALL
SELECT user_id, category_id_3 FROM table1
-- ...repeat the UNION ALL / SELECT pair for category_id_4 through category_id_10
) AS T
WHERE category_id IS NOT NULL;
Since you only have to do this 10 times, and you can throw the code away when you are finished, I would think that this is the easiest way.
One table for users:
users(id, name, username, etc)
One for categories:
categories(id, category_name)
One to link the two, including any extra information you might want on that join.
categories_users(user_id, category_id)
-- or with extra information --
categories_users(user_id, category_id, date_created, notes)
To transfer the data across to the link table would be a case of writing a series of SQL INSERT statements. There's probably some awesome way to do it in one go, but since there are only 10 category columns, just copy-and-paste IMO:
INSERT INTO categories_users (user_id, category_id)
SELECT user_id, category_id_1
FROM old_categories
WHERE category_id_1 IS NOT NULL;
-- ...then repeat with category_id_2 through category_id_10