Query rows and include rows with columns reversed - sql

I'm trying to query a table. I want the results to include the FROM and TO columns, but then also include rows with these two values reversed. And then I want to eliminate all duplicates. (A duplicate is the same two cities in the same order.)
For example, given this data.
Trips
FROM                 TO
-------------------- --------------------
West Jordan          Taylorsville
Salt Lake City       Ogden
West Jordan          Taylorsville
Sandy                South Jordan
Taylorsville         West Jordan
I would want the following results.
West Jordan          Taylorsville
Taylorsville         West Jordan
Salt Lake City       Ogden
Ogden                Salt Lake City
Sandy                South Jordan
South Jordan         Sandy
I want to do this using C# and Entity Framework, but I could use raw SQL if I need to.
Is it possible to do this in a query, or do I need to manually perform some of this logic?

Not sure if I'm following, but doesn't just a simple union work for your sample?
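select [From], [To]   -- From and To are reserved words, so they must be bracketed
from some_table
union                 -- union (unlike union all) removes duplicate rows
select [To], [From]
from some_table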
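For what it's worth, here is a minimal sketch that checks the union idea against the sample data from the question. It uses Python's built-in sqlite3 module purely as a stand-in engine (an assumption on my part; the question targets Entity Framework and SQL Server), and the variable name trip_pairs is only illustrative.

```python
import sqlite3

# In-memory SQLite database seeded with the sample Trips rows from the question.
conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE Trips ("From" TEXT, "To" TEXT)')
conn.executemany(
    "INSERT INTO Trips VALUES (?, ?)",
    [
        ("West Jordan", "Taylorsville"),
        ("Salt Lake City", "Ogden"),
        ("West Jordan", "Taylorsville"),
        ("Sandy", "South Jordan"),
        ("Taylorsville", "West Jordan"),
    ],
)

# UNION (unlike UNION ALL) drops duplicate rows from the combined result.
# From and To are reserved words, so they are quoted (SQLite uses double quotes).
trip_pairs = conn.execute(
    """
    SELECT "From", "To" FROM Trips
    UNION
    SELECT "To", "From" FROM Trips
    """
).fetchall()

for origin, destination in trip_pairs:
    print(origin, "->", destination)
```

The row order is not guaranteed without an ORDER BY, so add one if the order matters.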

I believe the first subquery should handle the first part of your question, and the WHERE ID NOT IN should handle the second part.
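SELECT *
FROM
(
    SELECT *
    FROM Trips
    WHERE ID IN (
        SELECT t1.ID   -- qualified, because ID exists in both t1 and t2
        FROM Trips t1
        INNER JOIN Trips AS t2
            ON t2.[To] = t1.[From] AND t2.[From] = t1.[To]
    )
) AS reversed   -- a derived table needs an alias
WHERE ID NOT IN
(
    SELECT MIN(ID)
    FROM Trips
    GROUP BY [From], [To]
)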
I am assuming there is more to the table than just those fields. Usually you have a field (primary key) that uniquely identifies the row. I am using ID for that field; replace it with whatever your table uses.

Related

SQL exclude values that are in another data frame column

Say I have two tables
First_table
id   occupation
efg  carpenter
hjk  teacher
moo  scientist
dss  engineer
Second_table
id   state
efg  PA
loi  DE
moo  NY
nbw  MD
Now I want to write a query that gets rid of the rows of the first table, if first_table.id is in second_table.id. So the output would be
id   occupation
hjk  teacher
dss  engineer
One way I could do this is by writing a where clause, and then put parameters into the where clause such as
where first_table.id != 'moo' and first_table.id != 'efg'
but that would require me to write some logic to figure out which data to exclude, and I would want all the logic to be in a query anyways.
This sounds like not exists:
select f.*
from first_table f
where not exists (select 1 from second_table s where s.id = f.id);
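A quick way to convince yourself that NOT EXISTS gives the expected rows is to run it against the sample data. The sketch below assumes Python's built-in sqlite3 module as the engine and uses the table names from the question; the variable name remaining is only illustrative.

```python
import sqlite3

# In-memory database with the two sample tables from the question.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE first_table (id TEXT, occupation TEXT);
    CREATE TABLE second_table (id TEXT, state TEXT);
    INSERT INTO first_table VALUES
        ('efg', 'carpenter'), ('hjk', 'teacher'),
        ('moo', 'scientist'), ('dss', 'engineer');
    INSERT INTO second_table VALUES
        ('efg', 'PA'), ('loi', 'DE'), ('moo', 'NY'), ('nbw', 'MD');
    """
)

# Keep only the first_table rows whose id has no match in second_table.
remaining = conn.execute(
    """
    SELECT f.id, f.occupation
    FROM first_table f
    WHERE NOT EXISTS (SELECT 1 FROM second_table s WHERE s.id = f.id)
    ORDER BY f.rowid
    """
).fetchall()

print(remaining)
```

The exclusion list comes from second_table itself, so nothing has to be hard-coded in the WHERE clause.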

SQL Insert with value from different table

I have 2 tables storing information. For example:
Table 1 contains persons:
ID NAME CITY
1 BOB 1
2 JANE 1
3 FRED 2
CITY is an ID that refers to a different table:
ID NAME
1 Amsterdam
2 London
The problem is that I want to insert data that I receive in the format:
ID NAME CITY
1 PETER Amsterdam
2 KEES London
3 FRED London
Given that the list of cities is complete (I never receive a city that is not in my list), how can I insert the new persons received from outside into the table with the right ID for the city?
Should I replace the names before I try to insert them, or is there a performance-friendly way to make SQL do this for me? (I might have to insert thousands of lines at once.)
The SQL server I'm using is Microsoft SQL Server 2012.
First, load the data to be inserted into a table.
Then, you can just use a join:
insert into persons(id, name, city)
select st.id, st.name, c.id
from #StagingTable st left join
cities c
on st.city = c.name;
Note: The persons.id should probably be an identity column so it wouldn't be necessary to insert it.
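As a rough illustration of the staging-table idea, the sketch below loads the incoming rows into a temporary table and lets one INSERT ... SELECT resolve every city name to its ID. It assumes Python's built-in sqlite3 module instead of SQL Server, and the names staging, incoming and inserted are made up for the example. It also uses an inner join where the answer uses a left join, so a row with an unknown city would be skipped instead of inserted with a NULL city; that is a choice of this sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE cities (id INTEGER PRIMARY KEY, name TEXT UNIQUE);
    CREATE TABLE persons (
        id INTEGER PRIMARY KEY,              -- assigned automatically by SQLite
        name TEXT,
        city INTEGER REFERENCES cities(id)
    );
    CREATE TEMP TABLE staging (name TEXT, city TEXT);  -- hypothetical staging table
    INSERT INTO cities VALUES (1, 'Amsterdam'), (2, 'London');
    """
)

# Rows as received from outside: the city arrives as a name, not an ID.
incoming = [("PETER", "Amsterdam"), ("KEES", "London"), ("FRED", "London")]
conn.executemany("INSERT INTO staging VALUES (?, ?)", incoming)

# One set-based statement resolves every city name to its ID.
conn.execute(
    """
    INSERT INTO persons (name, city)
    SELECT st.name, c.id
    FROM staging st
    JOIN cities c ON c.name = st.city
    """
)

inserted = conn.execute("SELECT id, name, city FROM persons ORDER BY id").fetchall()
print(inserted)
```

Because the lookup happens in a single statement, the cost of the join is paid once for the whole batch and not once per inserted row.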
insert into persons (ID, NAME, CITY) -- you don't need to include ID if it is auto-increment
values
(1, 'PETER', (select ID from city where NAME = 'Amsterdam')) -- the subquery looks up the city's ID by its name
If you want to add 1000 rows at a time, it would be better to use a stored procedure.

Need a little SQL help - Getting number of items in common

Imagine I have a table like such
UserID  Name   Hobbies
00001   Jim    Baseball, Hockey, Astronomy
00002   Jack   Baseball, Football, Video Games
00003   Jill   Astronomy, Shopping, Soccer
00004   Jane   Hockey, Astronomy, Video Games
00005   Jacob  Football, Basketball, Video Games
Now, what I want to do is get a count of hobbies in common. So, let's say I plug in 00001 into a textbox or query string or whatever. I want to see something like:
Name   Hobbies
Jack   You have (1) hobby in common
Jill   You have (1) hobby in common
Jane   You have (2) hobbies in common
Jacob  You have (0) hobbies in common
How would I write the code for that? I'm stumped. I'm thinking it's got to do with string matching, but I have no idea how to do that.
The first choice is to fix your data structure. Comma-delimited lists are bad, bad, bad. A separate table storing one row per person and per hobby is good, good, good.
If you are stuck with someone else's bad decisions, there is a little recourse. First Google "sql server split" and get your favorite string splitting function.
Then, you can do:
with t as (
      select u.*, s.val as hobby
      from Users u cross apply   -- Users is a placeholder for your table name
           dbo.split(u.Hobbies, ', ') as s(val) -- Note, some `split()` implementations also have a `pos` value
     )
select t.Name, count(tuser.userId) as NumInCommon
from t left join
     t tuser
     on t.hobby = tuser.hobby and tuser.userId = '00001'
group by t.userId, t.Name;
It is not worth constructing the full sentence in SQL, unless you really want to. Use SQL primarily to get the data you want. (Formatting in SQL can be useful sometimes, but it is really more for the application code.)
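To make the "one row per person and per hobby" suggestion concrete, here is a small sketch that splits the comma-delimited lists in application code and then counts shared hobbies with a self-join. It assumes Python's built-in sqlite3 module in place of SQL Server, and the names users, user_hobbies and hobbies_in_common are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE users (user_id TEXT PRIMARY KEY, name TEXT);
    CREATE TABLE user_hobbies (user_id TEXT, hobby TEXT);  -- one row per user and hobby
    """
)

raw_rows = [
    ("00001", "Jim", "Baseball, Hockey, Astronomy"),
    ("00002", "Jack", "Baseball, Football, Video Games"),
    ("00003", "Jill", "Astronomy, Shopping, Soccer"),
    ("00004", "Jane", "Hockey, Astronomy, Video Games"),
    ("00005", "Jacob", "Football, Basketball, Video Games"),
]
for user_id, name, hobbies in raw_rows:
    conn.execute("INSERT INTO users VALUES (?, ?)", (user_id, name))
    # Split the comma-delimited list into one row per hobby.
    conn.executemany(
        "INSERT INTO user_hobbies VALUES (?, ?)",
        [(user_id, hobby.strip()) for hobby in hobbies.split(",")],
    )


def hobbies_in_common(target_id):
    """Return (name, shared hobby count) for every user except target_id."""
    return conn.execute(
        """
        SELECT u.name, COUNT(mine.hobby) AS num_in_common
        FROM users u
        JOIN user_hobbies theirs ON theirs.user_id = u.user_id
        LEFT JOIN user_hobbies mine
               ON mine.hobby = theirs.hobby AND mine.user_id = ?
        WHERE u.user_id <> ?
        GROUP BY u.user_id, u.name
        ORDER BY u.user_id
        """,
        (target_id, target_id),
    ).fetchall()


print(hobbies_in_common("00001"))
```

The LEFT JOIN keeps users with no shared hobbies in the result, so Jacob shows up with a count of 0.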
create table #temp_hobbies
(hobby_id int
,hobby varchar(50))
insert into #temp_hobbies values
(1, 'football')
,(2,'baseball')
create table #temp_people
(user_ids int,
name varchar(50),
hobby_ids int)
insert into #temp_people values
(01,'Adam',1)
,(01,'Adam',2)
,(02,'Dave',1)
,(03,'Matt',2)
select count(distinct hobby) , count(distinct name)
from #temp_hobbies a
inner join #temp_people b on a.hobby_id = b.hobby_ids
This is only part of the solution: you still need to add a query that computes, for each user, how many hobbies they share with the other user.
As the other answers suggest, try separating the hobbies into a separate table and joining on int keys. SQL Server processes ints faster than varchars, especially if you need to do this for thousands of records.
First of all, please NORMALIZE your data. You can see a lot of repeating hobbies in each row, and the current layout is tedious to search and hard to maintain.
You can keep all your USERS data in one table, as below:
CREATE TABLE USERS ( UserID , NAME ); -- USERID being the PRIMARY KEY
You can keep all your HOBBIES in another table, as below:
CREATE TABLE HOBBIES ( HOBBYID, HOBBYNAME); -- HOBBYID being the PRIMARY KEY
You can have a third table that maps USERS to HOBBIES, as below:
CREATE TABLE USERS_HOBBIES ( USERID , HOBBYID );
Once the tables are normalized as above, you can get the desired result with the query below:
SELECT u.NAME, count(*) AS Hobbies
FROM USERS u
INNER JOIN USERS_HOBBIES uh ON u.UserID = uh.USERID
INNER JOIN HOBBIES h ON uh.HOBBYID = h.HOBBYID
WHERE h.HOBBYID IN (
    SELECT a.HOBBYID
    FROM (SELECT DISTINCT HOBBYID FROM USERS_HOBBIES WHERE USERID = '00001') a
    INNER JOIN
         (SELECT DISTINCT HOBBYID FROM USERS_HOBBIES WHERE USERID <> '00001') b
    ON a.HOBBYID = b.HOBBYID
)
AND u.USERID <> '00001' -- count the shared hobbies of every user other than 00001
GROUP BY u.NAME
P.S.: The above query syntax is for Oracle.

Include a string in Select Query

I'm wondering whether a query like the one below is possible:
SELECT America, England, DISTINCT (country) FROM tb_country
which will (my intention is to) display :
America
England
(List of distinct country field in tb_country)
So the point is to display (for example) America and England even if the DISTINCT country field returns nothing. Basically, I need this query to populate a select dropdown and offer some sticky values the user can pick, while still allowing users to add a new country as they wish.
It also goes without saying that if a row in tb_country has a value of America or England, it should not show up as a duplicate in the query result. So if tb_country has this list of values:
Germany
England
Holland
The query will only output :
America
England
Germany
Holland
You need to use a UNION:
SELECT 'America' AS country
UNION
SELECT 'England' AS country
UNION
SELECT DISTINCT(c.country) AS country
FROM TB_COUNTRY c
UNION will remove duplicates; UNION ALL will not (but is faster for it).
The data type must match for each ordinal position in the SELECT clause. Meaning, if the first column in the first query were INT, the first column for all the unioned statements afterwards need to be INT as well or NULL.
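Here is a short sketch of the UNION approach run against the sample values from the question. It assumes Python's built-in sqlite3 module as the engine; the ORDER BY and the variable name countries are additions for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tb_country (country TEXT)")
conn.executemany(
    "INSERT INTO tb_country VALUES (?)",
    [("Germany",), ("England",), ("Holland",)],
)

# The two literal rows are always present; UNION removes the duplicate England.
countries = [
    row[0]
    for row in conn.execute(
        """
        SELECT 'America' AS country
        UNION
        SELECT 'England'
        UNION
        SELECT country FROM tb_country
        ORDER BY country
        """
    )
]

print(countries)
```

With an empty tb_country the same query still returns America and England, which is the "sticky values" behaviour the question asks for.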
Why not add a weight column to tb_country and use an ORDER BY clause?
Perform once:
update tb_country set weight = 1 where country = 'England';
update tb_country set weight = 1 where country = 'America';
Then use it:
select country from tb_country group by country order by max(weight) desc; -- group by instead of distinct, since some engines reject order by on a column that is not in a select distinct list
Another way is to use an extra country table with two columns (country, weight) and an outer join.
Personally, I would prefer a country table with a UNIQUE constraint on the country field and a foreign key referencing it.

UPDATE query that fixes orphaned records

I have an Access database that has two tables related by PK/FK. Unfortunately, the tables have allowed duplicate/redundant records, which has made the database a bit screwy. I am trying to figure out a SQL statement that will fix the problem.
To better explain the problem and goal, I have created example tables to use as reference:
(Image: http://img38.imageshack.us/img38/9243/514201074110am.png)
You'll notice there are two tables, a Student table and a TestScore table where StudentID is the PK/FK.
The Student table contains duplicate records for students John, Sally, Tommy, and Suzy. In other words, the Johns with StudentIDs 1 and 5 are the same person, the Sallys with StudentIDs 2 and 6 are the same person, and so on.
The TestScore table relates test scores with a student.
Ignoring how or why the Student table allowed duplicates, the goal is to update the TestScore table so that every disabled StudentID is replaced with the corresponding enabled StudentID. So all rows with StudentID = 1 (John) will be updated to 5, all rows with StudentID = 2 (Sally) will be updated to 6, and so on. Here's the resulting TestScore table that I'm shooting for (notice there is no longer any reference to the disabled StudentIDs 1-4):
(Image: http://img163.imageshack.us/img163/1954/514201091121am.png)
Can you think of a query (compatible with MS Access's JET Engine) that can accomplish this goal? Or, maybe, you can offer some tips/perspectives that will point me in the right direction.
Thanks.
The only way to do this is through a series of queries and temporary tables.
First, I would create the following Make Table query that you would use to create a mapping of the bad StudentID to correct StudentID.
Select S1.StudentId As NewStudentId, S2.StudentId As OldStudentId
Into zzStudentMap
From Student As S1
Inner Join Student As S2
On S2.Name = S1.Name
Where S1.Disabled = False
And S2.StudentId <> S1.StudentId
And S2.Disabled = True
Next, you would use that temporary table to update the TestScore table with the correct StudentID.
Update TestScore
Inner Join zzStudentMap
On zzStudentMap.OldStudentId = TestScore.StudentId
Set StudentId = zzStudentMap.NewStudentId
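The two steps can be tried out on a small scale with the sketch below. It assumes Python's built-in sqlite3 module, not Access/JET, so two things differ from the queries above: the mapping table is built with CREATE TABLE ... AS instead of SELECT ... INTO, and a correlated subquery stands in for UPDATE ... INNER JOIN, which SQLite does not support. Only John and Sally are included, and the scores are invented because the original screenshots are not available.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Student (StudentId INTEGER PRIMARY KEY, Name TEXT, Disabled INTEGER);
    CREATE TABLE TestScore (StudentId INTEGER, Score INTEGER);
    INSERT INTO Student VALUES
        (1, 'John', 1), (2, 'Sally', 1), (5, 'John', 0), (6, 'Sally', 0);
    -- Scores are invented for the illustration.
    INSERT INTO TestScore VALUES (1, 90), (2, 85), (5, 70), (6, 95);

    -- Step 1: map each disabled id to the enabled id that has the same name.
    CREATE TABLE zzStudentMap AS
    SELECT S1.StudentId AS NewStudentId, S2.StudentId AS OldStudentId
    FROM Student S1
    JOIN Student S2 ON S2.Name = S1.Name
    WHERE S1.Disabled = 0 AND S2.Disabled = 1;

    -- Step 2: repoint the scores that still reference a disabled id.
    UPDATE TestScore
    SET StudentId = (SELECT m.NewStudentId
                     FROM zzStudentMap m
                     WHERE m.OldStudentId = TestScore.StudentId)
    WHERE StudentId IN (SELECT OldStudentId FROM zzStudentMap);
    """
)

scores = conn.execute(
    "SELECT StudentId, Score FROM TestScore ORDER BY Score"
).fetchall()
print(scores)
```

Note that the mapping relies on names being unique per real person; two different students who share a name would be merged.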
The most common technique to identify duplicates in a table is to group by the fields that represent duplicate records:
ID  FIRST_NAME  LAST_NAME
1   Brian       Smith
3   George      Smith
25  Brian       Smith
In this case we want to remove one of the Brian Smith Records, or in your case, update the ID field so they both have the value of 25 or 1 (completely arbitrary which one to use).
SELECT min(id), first_name, last_name
FROM example
GROUP BY first_name, last_name
Using min on ID will return:
ID  FIRST_NAME  LAST_NAME
1   Brian       Smith
3   George      Smith
If you use max you would get
ID  FIRST_NAME  LAST_NAME
25  Brian       Smith
3   George      Smith
I usually use this technique to delete the duplicates, not update them:
DELETE FROM example
WHERE ID NOT IN (SELECT MAX (ID)
FROM example
GROUP BY first_name, last_name)
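For reference, this is what the delete does to the three sample rows. The sketch assumes Python's built-in sqlite3 module as the engine, and the variable name survivors is only illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE example (id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT);
    INSERT INTO example VALUES
        (1, 'Brian', 'Smith'), (3, 'George', 'Smith'), (25, 'Brian', 'Smith');

    -- Keep the row with the highest id in each (first_name, last_name) group.
    DELETE FROM example
    WHERE id NOT IN (SELECT MAX(id)
                     FROM example
                     GROUP BY first_name, last_name);
    """
)

survivors = conn.execute(
    "SELECT id, first_name, last_name FROM example ORDER BY id"
).fetchall()
print(survivors)
```

Swapping MAX for MIN would keep the row with ID 1 instead of the one with ID 25.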