I'm working on cleaning this dataset:
https://www.kaggle.com/heesoo37/120-years-of-olympic-history-athletes-and-results, using Dremio (an online tool) so I can use an SQL editor (but I don't know which DBMS it uses).
Now I'm trying to delete from the column Event the words that are contained in the column Sport. (I've already made some modifications: in the column Event I've deleted the occurrences of the words "Men's" and "Women's".)
Attached you'll find the current situation and the desired result.
How can I solve the problem?
I hope I have been clear. Thank you in advance for your help. :)
Edit: I've found the original query made by Dremio
SELECT ID, Name, Gender, Age, Height, Weight, Team, "Olympic Games"."Year" AS "Year", Season, City, Sport, CASE WHEN regexp_like(CASE WHEN regexp_like(Event, '.*?\QMen''s\E.*?') THEN regexp_replace(Event, '\QMen''s\E', '') ELSE Event END, '.*?\QWomen''s\E.*?') THEN regexp_replace(CASE WHEN regexp_like(Event, '.*?\QMen''s\E.*?') THEN regexp_replace(Event, '\QMen''s\E', '') ELSE Event END, '\QWomen''s\E', '') ELSE CASE WHEN regexp_like(Event, '.*?\QMen''s\E.*?') THEN regexp_replace(Event, '\QMen''s\E', '') ELSE Event END END AS Event, Medal
FROM "#Sboorn"."Olympic Games"
WHERE NOT regexp_like(ID, '.*?\QID\E.*?')
You can use a CASE expression to check whether event starts with the sport followed by a space. If so, use substring() to omit the first n characters, where n is the length of sport plus the space. Otherwise, return event unchanged.
SELECT sport,
CASE
WHEN event LIKE concatenate(sport, ' %') THEN
substring(event, length(sport) + 2, length(event) - length(sport) - 1)
ELSE
event
END event
FROM elbat;
As you didn't tag your actual DBMS, the names of the functions might differ (e.g. concat() instead of concatenate(), substr() instead of substring() or len() instead of length()). But some equivalent should be available in most DBMS.
Depending on the actual DBMS there also might be more elegant solutions, like regular expressions.
And next time please don't post images. Use CREATE TABLE and INSERT INTO statements to show what your tables look like, and plain text to show the desired result.
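As a minimal sketch of the CASE approach above, the following Python script runs the query against an in-memory SQLite database. The sample rows are hypothetical, and SQLite is only a stand-in for whatever engine Dremio uses, so the function spellings (||, substr(), length()) are assumptions that may need adapting.

```python
import sqlite3

# Hypothetical sample rows; the table name "elbat" follows the answer above.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE elbat (sport TEXT, event TEXT)")
conn.executemany(
    "INSERT INTO elbat (sport, event) VALUES (?, ?)",
    [
        ("Judo", "Judo Extra-Lightweight"),
        ("Swimming", "Swimming 100 metres Freestyle"),
        ("Athletics", "Marathon"),  # no sport prefix, so it stays unchanged
    ],
)

# SQLite spells the pieces as || (concatenation), substr() and length().
# Omitting the third argument of substr() returns the rest of the string.
query = """
    SELECT sport,
           CASE
               WHEN event LIKE sport || ' %'
                   THEN substr(event, length(sport) + 2)
               ELSE event
           END AS event
    FROM elbat
    ORDER BY rowid
"""
rows = conn.execute(query).fetchall()
for sport, event in rows:
    print(f"{sport}: {event}")
```

Note that LIKE in SQLite is case-insensitive for ASCII letters by default; other engines may compare case-sensitively.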
Related
I have been using SQL for quite some time, but I am unable to figure out the query logic below.
I'm extracting two values:
FIRST_NAME, e.g. abc
FIRST_NAMES_LIST (a list containing first names), e.g. ['abc','abc','cba','dba'] (this may also contain junk values in between the strings)
I'm trying to search for FIRST_NAME in FIRST_NAMES_LIST and return 1 or 0, using the logic below:
CASE FIRST_NAME in FIRST_NAMES_LIST then 1
else 0
but this isn't giving the correct result.
Can somebody please help?
Thanks,
Naseer
Take a look at this information:
https://www.techonthenet.com/oracle/functions/instr.php
INSTR is a function that returns a non-zero position if the substring is found. If not, it returns 0.
Your question does not include a clear example, so I cannot give you an exact answer. Have a look at what the function can do.
Regards!
Ideally you would parse your JSON into a table object (I don't know how to do that in Oracle) and then search that table object for the value, but that is pretty expensive. It would be more robust and would handle special characters and corner cases better.
On the other hand, if the names are going to stay simple (i.e., no quote marks ' or commas), you could use a LIKE expression and search the string.
CASE WHEN FIRST_NAMES_LIST LIKE '%''' || FIRST_NAME || '''%' THEN 1 ELSE 0 END
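The idea of searching for the name together with its surrounding quotes can be sketched as follows. This is an illustration run on SQLite from Python, not on Oracle; the table name people is hypothetical, and instr() is used here in place of LIKE, although both should behave the same for simple names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table holding the two values from the question.
conn.execute("CREATE TABLE people (first_name TEXT, first_names_list TEXT)")
conn.executemany(
    "INSERT INTO people VALUES (?, ?)",
    [
        ("abc", "['abc','abc','cba','dba']"),  # present in the list
        ("ab", "['abc','abc','cba','dba']"),   # only a fragment of an entry
        ("xyz", "['abc','abc','cba','dba']"),  # absent
    ],
)

# '''' is a SQL string holding one single quote, so the search term
# becomes 'abc' including its quotes, which rules out partial matches.
query = """
    SELECT first_name,
           CASE WHEN instr(first_names_list, '''' || first_name || '''') > 0
                THEN 1 ELSE 0 END AS found
    FROM people
    ORDER BY rowid
"""
flags = conn.execute(query).fetchall()
print(flags)
```

Including the quotes in the search term is what keeps ab from matching inside 'abc'; it relies on every list entry being quoted consistently.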
Yes, I'm currently using INSTR, but the query is taking a bit of time. I hope this resolves the issue. Thanks.
Is it possible to use the STUFF function with a condition?
I'm trying to create a regex pattern from the values in SQL table.
My stuff function looks like this:
stuff(name,patindex('%Apple%',name),len(name),'%')
But I also need to run
stuff(name,patindex('%Mango%',name),len(name),'%')
Can I do both in the same STUFF call with an OR condition?
Your exact logic or expected result is not entirely clear, but you could try writing the above using a CASE expression:
STUFF(name,
CASE WHEN PATINDEX('%Apple%', name) < PATINDEX('%Mango%', name)
THEN PATINDEX('%Apple%', name)
ELSE PATINDEX('%Mango%', name) END,
LEN(name), '%')
The logic here is to choose the starting point for the STUFF operation based on which fruit substring appears first in the name.
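One thing to keep in mind with the "earliest match" idea is the case where one of the keywords is missing: PATINDEX returns 0 for it, and STUFF returns NULL when the start position is 0. The sketch below illustrates the intended logic in plain Python rather than T-SQL; the function name stuff_from_first_match is hypothetical, and unlike PATINDEX under a typical collation, str.find is case-sensitive.

```python
def stuff_from_first_match(name, keywords, replacement="%"):
    """Replace everything from the earliest keyword match to the end of name."""
    # str.find returns -1 when the keyword is absent; keep only real matches.
    positions = [pos for pos in (name.find(k) for k in keywords) if pos != -1]
    if not positions:
        return name  # nothing matched: leave the value untouched
    return name[: min(positions)] + replacement


examples = ["Green Apple Pie", "Ripe Mango and Apple", "Banana"]
results = [stuff_from_first_match(n, ["Apple", "Mango"]) for n in examples]
print(results)
```

Filtering out the "not found" positions before taking the minimum is the step a T-SQL version would also need, for example by treating a PATINDEX result of 0 as larger than any real position.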
New to TSQL and SQL generally, please pardon if this is really basic:
I am working with a new-to-me-database that has ignored some best practices. Relevant to this discussion, some data is stored in a generalized note field, including loyalty numbers. The good news is that the loyalty numbers are at least stored consistently within the note.
So, a simplified example from the note table might be:
I have verified that every Loyalty Number is stored consistently ("Loyalty Number ####"), but obviously this is not ideal. I want to extract the Loyalty Number for every primary key that has them, then create a new field that stores the Loyalty Number.
What I'm having trouble with is the following: How do I run a query that will give me each primary key then, if there is a loyalty number return it, if not leave it null or say something like no result found. E.g., turn the above into something like.
It's trivially easy to construct something like "select primary_key, note from note_table where note like '%Loyalty Number%'", but that doesn't do the job of clipping down to just the loyalty number (and leaving out extraneous text). The uniformity of the data means I could probably do this in Excel, but I'm wondering if it's possible in TSQL. Thanks in advance for your help.
Give something like this a try using case with substring and charindex:
select id,
case when note like '%Loyalty Number [0-9][0-9][0-9][0-9]%'
then 'Loyalty Number ' +
substring(note,
charindex('Loyalty Number', note) + Len('Loyalty Number ') + 1, 4)
end as Note
from note
The CASE expression checks whether a four-digit Loyalty Number exists in the note. SUBSTRING then extracts the digits, using CHARINDEX to find the starting position. (LEN ignores the trailing space in 'Loyalty Number ', which is why the extra + 1 is needed.) This hard-codes a length of 4 characters for the loyalty number. Given your comments, this should work. If the number of characters varies, you'll need to modify this slightly.
Building on @segeddes's answer, here's the rest of the code, which will update your new LoyaltyNumber column.
Working SQL Fiddle: http://sqlfiddle.com/#!3/36e46/8
UPDATE note_table
SET LoyaltyNumber =
CASE
WHEN note LIKE '%Loyalty Number [0-9][0-9][0-9][0-9]%'
THEN SUBSTRING(note, CHARINDEX('Loyalty Number', note)
+ LEN('Loyalty Number ') + 1, 4)
ELSE 'Regular Customer'
END
FROM note_table
Table Definition and CRUD
CREATE TABLE note_table (
id int identity(1,1),
Note VarChar(500),
LoyaltyNumber varchar(20)
)
Insert Into note_table(Note) Values
('Customer Since 2012. Loyalty Number 4747'),
('Loyalty Number 2209'),
('Loyalty Number 2234.Customer Since 2009'),
('Pending Order');
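For readers without a SQL Server instance at hand, the extraction can be tried out with the following sketch, which loads the sample notes above into an in-memory SQLite database from Python. The function names differ from T-SQL and are SQLite's own: GLOB stands in for the LIKE pattern with [0-9] classes, instr() for CHARINDEX, and substr() for SUBSTRING. Rows without a loyalty number come back as NULL (None in Python).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE note_table (id INTEGER PRIMARY KEY, note TEXT)")
conn.executemany(
    "INSERT INTO note_table (note) VALUES (?)",
    [
        ("Customer Since 2012. Loyalty Number 4747",),
        ("Loyalty Number 2209",),
        ("Loyalty Number 2234.Customer Since 2009",),
        ("Pending Order",),
    ],
)

# SQLite's length() counts the trailing space, so no extra "+ 1" is needed
# to land on the first digit after 'Loyalty Number '.
query = """
    SELECT id,
           CASE
               WHEN note GLOB '*Loyalty Number [0-9][0-9][0-9][0-9]*'
                   THEN substr(note,
                               instr(note, 'Loyalty Number ')
                                   + length('Loyalty Number '),
                               4)
           END AS loyalty_number
    FROM note_table
    ORDER BY id
"""
loyalty = conn.execute(query).fetchall()
for note_id, number in loyalty:
    print(note_id, number)
```

The four-character length is still hard-coded here, as in the answers above.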
I am working with a table that contains two versions of stored information. To simplify, one column contains the old description of a file run, while another column contains the updated standard for displaying run files. It gets more complicated in that the older column can contain multiple standards itself. The table:
Old Column            New Column
Desc: LGX/101/rpt     null
null                  Home
Print: LGX/234/rpt    null
null                  Print
null                  Page
I need to combine the two columns into one, but I also need to delete the "Print: " and "Desc: " string from the beginning of the old column values. Any suggestions? Let me know if/when I'm forgetting something you need to know!
(I am writing in Cache SQL, but I'd just like a general approach to my problem, I can figure out the specifics past that.)
EDIT: the condition is that if substr(oldcol,1,6) = 'desc: ' then substr(oldcol,7)
else if substr(oldcol,1,7) = 'print: ' then substr(oldcol,8) etc. So as to take out the "desc: " and the "print: " to sanitize the data somewhat.
EDIT2: I want to make the table look like this:
Col
LGX/101/rpt
Home
LGX/234/rpt
Print
Page
It's difficult to understand exactly what you are looking for. Does the above represent before/after, or both columns that need combining/merging?
My guess is that COALESCE might be able to help you. It takes a bunch of parameters and returns the first non NULL.
It looks like you want to grab the value from new if old is NULL, and from old if new is NULL. To do that you can use a CASE expression in your SQL. I know CASE is supported by MySQL, but I'm not sure whether it is available in your database.
SELECT (CASE WHEN old_col IS NULL THEN new_col ELSE old_col END) as val FROM table_name
This will grab new_col if old_col is NULL, otherwise it will grab old_col.
You can remove the "Print:" and "Desc:" prefixes by combining the CHARINDEX and SUBSTRING functions:
SELECT CASE WHEN CHARINDEX(':',COALESCE(OldCol,NewCol)) > 0 THEN
LTRIM(SUBSTRING(COALESCE(OldCol,NewCol),CHARINDEX(':',COALESCE(OldCol,NewCol))+1,8000))
ELSE
COALESCE(OldCol,NewCol)
END AS Newcolvalue
FROM [SchemaName].[TableName]
CHARINDEX gives the position of the character or string you are searching for.
So you get the position of ":" in the combined value (the COALESCE part), add 1 to it, and pass the result to SUBSTRING so that it returns the part after the ":". LTRIM then drops the space that follows the colon. Now you have a string without "Desc: " and "Print: ".
Hope this helps.
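To make the combined approach concrete, here is a small sketch that merges the two columns with COALESCE and strips the two known prefixes explicitly, as described in the question's EDIT, instead of cutting at the first colon. It runs on an in-memory SQLite database from Python; the table name runs and the column names old_col and new_col are hypothetical, and the function spellings may differ in Caché SQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE runs (id INTEGER PRIMARY KEY, old_col TEXT, new_col TEXT)"
)
conn.executemany(
    "INSERT INTO runs (old_col, new_col) VALUES (?, ?)",
    [
        ("Desc: LGX/101/rpt", None),
        (None, "Home"),
        ("Print: LGX/234/rpt", None),
        (None, "Print"),
        (None, "Page"),
    ],
)

# Step 1: coalesce() picks whichever column is not NULL.
# Step 2: CASE strips a known prefix; the trailing "%" pattern requires
# text after the prefix, so the plain value 'Print' is left alone.
query = """
    WITH merged AS (
        SELECT id, coalesce(old_col, new_col) AS val FROM runs
    )
    SELECT CASE
               WHEN val LIKE 'Desc: %'  THEN substr(val, length('Desc: ') + 1)
               WHEN val LIKE 'Print: %' THEN substr(val, length('Print: ') + 1)
               ELSE val
           END AS col
    FROM merged
    ORDER BY id
"""
merged = [row[0] for row in conn.execute(query)]
print(merged)
```

Listing the prefixes explicitly avoids truncating a value that merely happens to contain a colon somewhere else, at the cost of adding one WHEN branch per prefix.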
I want to have a query with a column that is a hardcoded value, not from a table. Can this be done? I basically need it as a placeholder that I am going to come back to later and fill in.
example:
SELECT
hat,
shoe,
boat,
somevalue = 0 as placeholder
FROM
objects
Then I would loop through this query later and fill in the placeholder.
In this example somevalue is not a field in objects; I need to fake it. I am doing this in ColdFusion and using two datasources to complete one query. I have tried the space() function but have been unable to get it to work.
Thanks.
SELECT
hat,
shoe,
boat,
0 as placeholder
FROM
objects
And '' as placeholder for strings.
This should work on most databases. You can also select a blank string as your extra column like so:
Select
Hat, Shoe, Boat, '' as SomeValue
From
Objects
For varchars, you may need to do something like this:
select convert(varchar(25), NULL) as abc_column into xyz_table
If you try
select '' as abc_column into xyz_table
you may get errors related to truncation, or an issue with null values, once you populate.
The answers above are correct, and what I'd consider the "best" answers. But just to be as complete as possible, you can also do this directly in CF using queryAddColumn.
See http://www.cfquickdocs.com/cf9/#queryaddcolumn
Again, it's more efficient to do it at the database level... but it's good to be aware of as many alternatives as possible (IMO, of course) :)
SELECT
hat,
shoe,
boat,
0 as placeholder -- for column having 0 value
FROM
objects
--OR '' as Placeholder -- for blank column
--OR NULL as Placeholder -- for column having null value
Thank you, in PostgreSQL this works for boolean
SELECT
hat,
shoe,
boat,
false as placeholder
FROM
objects
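The whole workflow from the question (select a constant placeholder column, then loop over the result and fill it in) might look like the sketch below. It uses Python with an in-memory SQLite database instead of ColdFusion, and the sample rows and the lookup dictionary, which stands in for the second datasource, are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # lets each row be converted to a dict
conn.execute("CREATE TABLE objects (hat TEXT, shoe TEXT, boat TEXT)")
conn.executemany(
    "INSERT INTO objects VALUES (?, ?, ?)",
    [("fedora", "loafer", "canoe"), ("beret", "boot", "kayak")],
)

# The literal 0 becomes an ordinary result column named "placeholder".
cursor = conn.execute(
    "SELECT hat, shoe, boat, 0 AS placeholder FROM objects ORDER BY rowid"
)
records = [dict(row) for row in cursor]

# Second pass: fill in the placeholder from another source
# (a hypothetical dictionary keyed by hat).
lookup = {"fedora": 10, "beret": 20}
for record in records:
    record["placeholder"] = lookup.get(record["hat"], 0)

print(records)
```

Choosing the literal's type up front (0, '', NULL or false) matters mostly when the later code or a target table expects a specific type.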