Split delimited entries into new rows in Access

Split delimited entries into new rows in Access - sql

So someone gave me a spreadsheet of orders, the unique value of each order is the PO, the person that gave me the spreadsheet is lazy and decided for orders with multiple PO's but the same information they'd just separate them by a "/". So for instance my table looks like this
PO Vendor State
123456/234567 Bob KY
345678 Joe GA
123432/123456 Sue CA
234234 Mike CA
What I hoped to do as separate the PO using the "/" symbol as a delimiter so it looks like this.
PO Vendor State
123456 Bob KY
234567 Bob KY
345678 Joe GA
123432 Sue CA
123456 Sue CA
234234 Mike CA
Now I have been brainstorming a few ways to go about this. Ultimately I want this data in Access. The data in its original format is in Excel. What I wanted to do is write a vba function in Access that I could use in conjunction with a SQL statement to separate the values. I am struggling at the moment though as I am not sure where to start.

About about this:
1) Import the source data into a new Access table called SourceData.
2) Create a new query, go straight into SQL View and add the following code:
SELECT * INTO ImportedData
FROM (
SELECT PO, Vendor, State
FROM SourceData
WHERE InStr(PO, '/') = 0
UNION ALL
SELECT Left(PO, InStr(PO, '/') - 1), Vendor, State
FROM SourceData
WHERE InStr(PO, '/') > 0
UNION ALL
SELECT Mid(PO, InStr(PO, '/') + 1), Vendor, State
FROM SourceData
WHERE InStr(PO, '/') > 0) AS CleanedUp;
This is a 'make table' query in Access jargon (albeit with a nested union query); for an 'append' query instead, alter the top two lines to be
INSERT INTO ImportedData
SELECT * FROM (
(The rest doesn't change.) The difference is that re-running a make table query will clear whatever was already in the destination table, whereas an append query adds to any existing data.
3) Run the query.

If I had to do it I would
Import the raw data into a table named [TempTable].
Copy [TempTable] to a new table named [ActualTable] using the "Structure Only" option.
Then, in a VBA routine I would
Open two DAO recordsets, rstIn for [TempTable] and rstOut for [ActualTable]
Loop through the rstIn recordset.
Use the VBA Split() function to split the [PO] values on "/" into an array.
For Each array item I would use rstOut.AddNew to write a record into [ActualTable]

In Excel, try this:
IF(ISERROR(FIND("/",A1,1))=TRUE,A1, Left(A1, 6))
In the cell next to it, put something like this:
IF(ISERROR(FIND("/",A1,1))=TRUE,"", Right(A1, 6))
This will break out your PO's into 2 columns. From there, write a loop that creates a new record where appropriate.

Related

SQL query to get the first letter of each word and seperate it by a dot and a space

I have never really used SQL much but recent changes due to working from home is forcing me to gain some knowledge in it. I have been doing fine so far but I am now running into a problem that I can't seem to find a solution for.
I have an excel sheet that pulls customer information trough a SQL query which runs by VBA code.
What I first needed to do is to get a full name from a customer and input this into the sheet. This works fine. I am using the following query for this:
Select concat(concat(Customer_First_Names,' '), Customer_Last_Name) FROM CustomerInformationTable where Customer_Number = &&1
This gives me the full name of a customer and spaces in between the first and last name and in between the names (the full first names are already spaced in between in the table).
Now, I got another request to not retrieve the first full first names and last name of a customer, but their initials and the last name.
For example:
Barack Hussein Obama
Would become
B. H. Obama
I need to do 2 things for this:
I need to change my query to retrieve only the initials for each first name. Like I said, all full first names (even if a customer has more then one first name) is located in the column Customer_First_Names.
I need to add a dot and a space after each initial.
How would I go on about this?
I have been thinking about using SUBSTRING but I am struggling on how to do this if there is more then one first name.
So this is not going to work:
Select concat(substr(Customer_First_Names, 1, 1), '. ') from CustomerInformationTable where Customer_Number = &&1
My apologies if this has already been ask on the board so far, I looked but I did not find a suitable solution.

Assuming you don't want to see 2 dots after someone who has just one first name (like J.. Smith), then here's a solution that works in postgres. Not sure what your db is, so you may need to adjust as needed.
The 'with' query is splitting apart the first names, limiting to two.
The 'case' statement then checks if the person has a second first name. If not, then only the first initial is provided and followed by a dot. Otherwise, both first initials are followed by a dot. Final results, all initials and names are separated by a space (like T. R. Smith).
So, a table looking like this:
cid first last
1 JAKE SMITH
2 TERREL HOWARD WILLIAMS
3 PHOEBE M KATES
Will produce the following results with the query below.
cid cust_name
1 J. SMITH
2 T. H. WILLIAMS
3 P. M. KATES
with first_names as
(select distinct customer_number ,
split_part(customer_first_name, ' ', 1) as first1,
split_part(customer_first_name, ' ', 2) as first2
from CustomerInformationTable
)
select distinct customer_number,
case
when fn.first2 = '' then substring(fn.first1, 1, 1) || '.'
else substring(fn.first1, 1, 1) || '. ' || substring(fn.first2, 1, 1) || '.'
end
|| ' ' || a.customer_last_name as cust_name
from CustomerInformationTable a
join first_names fn on fn.customer_number = a.customer_number

How to match entires in SQL based on their ending letter?

So I'm trying to match entries in two databases so in the new table the row is comprised of two words that end in the same ending letter. I'm working with two tables that have one column in each of them, each named word. table 1 contains the following data in order: Dog, High, It, Weeks, while table two contains the data: Bat, Is, Laugh, Sing. I need to select from both of these tables and match the words so that each row is as follows: Dog | Sing, High | Laugh, It | Bat, Weeks | Is
The screenshot is what I have so far for my SQL statement. I'm still early on in learning SQL so any info to help on this would be appreciated.

Recommend reading up on SUBSTR() for more information on why the below code works: https://docs.oracle.com/cd/B28359_01/olap.111/b28126/dml_functions_2101.htm#OLADM679
SELECT
a.word
, b.word
FROM sec1313_words1 a
JOIN sec1313_words2 b
ON SUBSTR(b.word, -1) = SUBSTR(a.word, -1)
ORDER BY a.word

SQL Query - More options and suggestions apart from pivoting

New to SQL please dont mind if this is a silly question..
My table looks like this
I want to send only one email to manager telling him that these employees in your group failed to fill timesheet.
currently i have pivoted the above table that looks like this
and sending emails by concatinating firstemp+secondemp+thirdemp+------
can this be done in any more easiest way..?

You can use CONCAT() function to retrieve a single row data in one column
SELECT M_EMAIL,
CONCAT(FIRSTEMP, SECONDEMP, THIRDEMP, FOURTHEMP, FIFTHEMP...)
FROM 'your_table';
CONCAT() replaces NULL values with an empty string.

Please don't pivot, as the concat is really ugly to maintain (and will break if a more capable manager pops up with more subordinates than your concat columns).
The syntax depends on what SQL server you use. For example, in MSSQL you could do this:
select manager, m_email, STRING_AGG(employee, ', ') as subordinates
from Employee
group by manager, m_email
This result has only 1 row per manager and fixed number of columns regardless how many subordinates the manager has:
manager | m_email | subordinates
----------------------------------
A | A#A.COM | b, D
D | D#D.COM | e, h
I | I#I.COM | j
Play with the example here: http://sqlfiddle.com/#!18/73bb5/5
Another option is just query relevant data to application layer and do the grouping there.

SQL Server Multiple Likes

I have an unusual question that seems simple but has me stumped in a SQL Server stored procedure.
I have 2 tables as described below.
tblMaster
ID, CommitDate, SubUser, OrigFileName
Sample data
ID CommitDate SubUser OrigFileName
----------------------------------------
1 2014-10-07 Test1 Test1.pdf
2 2014-10-08 Test2 Test2.pdf
3 2014-10-09 Test3 Test3.pdf
The above table is basically the header table that tracks the committed files. In addition to this, we have a details table with the following structure.
tblIndex
ID, FileID (Linking column to the header row described above), Word
Sample data:
1. 1, 1, Oil
2. 2, 1, oil
3. 3, 2, oil
4. 4, 2, tank
5. 5, 3, tank
The above rows represent the words that we want to search on and if a certain criteria matches return the corresponding filename/header row ID. What I would love to figure out to do is if I do a search for
One word (i.e. "oil"), then the system should respond with all the files that meet the criteria (easiest case and figured out)
If more than one word is searched for (i.e. "oil" and "tank"), then we should only see the second file since it is the only one that has both oil and tank as its key words.
Tried using a LIKE "%oil%" AND LIKE "%tank%" and that resulted in no rows being created since one value can't be both oil and tank.
Tried doing a LIKE "%oil%" OR LIKE "%tank%" but I get files 1, 2, and 3 since the OR is inclusive of all the other rows.
One last thing, I recognize I could just do a search for the first term and then save the results into a temp table and then search for the second term in that second table and I will get what I am looking for. The problem with that is that I don't exactly know how many items will be searched for. I don't want to have to create a structure where I am constantly having to store data into another temp table if someone does a search for 6 "keywords".
Any help/ideas will be much appreciated.

try this ! slightly differing from the previous answer
SELECT distinct FileID,COUNT(distinct t.word) FROM tblIndex t
WHERE t.Word LIKE '%oil%' OR t.Word LIKE '%tank%'
GROUP BY FileID
HAVING COUNT(distinct t.word) > 1

One simple option would be to do something like this :
SELECT FileID
FROM tblIndex t
WHERE t.Word LIKE '%oil%' OR t.Word LIKE '%tank%'
GROUP BY FileID
HAVING COUNT(*) > 1
This assume you do not have duplicate in your tblIndex.
I'm also unsure whether you really need the like or not. According to your sample data you don't, a basic comparison would be way more efficient and would avoid possible collisions.

SQL Query with multiple values in one column

I've been beating my head on the desk trying to figure this one out. I have a table that stores job information, and reasons for a job not being completed. The reasons are numeric,01,02,03,etc. You can have two reasons for a pending job. If you select two reasons, they are stored in the same column, separated by a comma. This is an example from the JOBID table:
Job_Number User_Assigned PendingInfo
1 user1 01,02
There is another table named Pending, that stores what those values actually represent. 01=Not enough info, 02=Not enough time, 03=Waiting Review. Example:
Pending_Num PendingWord
01 Not Enough Info
02 Not Enough Time
What I'm trying to do is query the database to give me all the job numbers, users, pendinginfo, and pending reason. I can break out the first value, but can't figure out how to do the second. What my limited skills have so far:
select Job_number,user_assigned,SUBSTRING(pendinginfo,0,3),pendingword
from jobid,pending
where
SUBSTRING(pendinginfo,0,3)=pending.pending_num and
pendinginfo!='00,00' and
pendinginfo!='NULL'
What I would like to see for this example would be:
Job_Number User_Assigned PendingInfo PendingWord PendingInfo PendingWord
1 User1 01 Not Enough Info 02 Not Enough Time
Thanks in advance

You really shouldn't store multiple items in one column if your SQL is ever going to want to process them individually. The "SQL gymnastics" you have to perform in those cases are both ugly hacks and performance degraders.
The ideal solution is to split the individual items into separate columns and, for 3NF, move those columns to a separate table as rows if you really want to do it properly (but baby steps are probably okay if you're sure there will never be more than two reasons in the short-medium term).
Then your queries will be both simpler and faster.
However, if that's not an option, you can use the afore-mentioned SQL gymnastics to do something like:
where find ( ',' |fld| ',', ',02,' ) > 0
assuming your SQL dialect has a string search function (find in this case, but I think charindex for SQLServer).
This will ensure all sub-columns begin and start with a comma (comma plus field plus comma) and look for a specific desired value (with the commas on either side to ensure it's a full sub-column match).
If you can't control what the application puts in that column, I would opt for the DBA solution - DBA solutions are defined as those a DBA has to do to work around the inadequacies of their users :-).
Create two new columns in that table and make an insert/update trigger which will populate them with the two reasons that a user puts into the original column.
Then query those two new columns for specific values rather than trying to split apart the old column.
This means that the cost of splitting is only on row insert/update, not on _every single select`, amortising that cost efficiently.
Still, my answer is to re-do the schema. That will be the best way in the long term in terms of speed, readable queries and maintainability.

I hope you are just maintaining the code and it's not a brand new implementation.
Please consider to use a different approach using a support table like this:
JOBS TABLE
jobID | userID
--------------
1 | user13
2 | user32
3 | user44
--------------
PENDING TABLE
pendingID | pendingText
---------------------------
01 | Not Enough Info
02 | Not Enough Time
---------------------------
JOB_PENDING TABLE
jobID | pendingID
-----------------
1 | 01
1 | 02
2 | 01
3 | 03
3 | 01
-----------------
You can easily query this tables using JOIN or subqueries.
If you need retro-compatibility on your software you can add a view to reach this goal.

I have a tables like:
Events
---------
eventId int
eventTypeIds nvarchar(50)
...
EventTypes
--------------
eventTypeId
Description
...
Each Event can have multiple eventtypes specified.
All I do is write 2 procedures in my site code, not SQL code
One procedure converts the table field (eventTypeIds) value like "3,4,15,6" into a ViewState array, so I can use it any where in code.
This procedure does the opposite it collects any options your checked and converts it in

If changing the schema is an option (which it probably should be) shouldn't you implement a many-to-many relationship here so that you have a bridging table between the two items? That way, you would store the number and its wording in one table, jobs in another, and "failure reasons for jobs" in the bridging table...

Have a look at a similar question I answered here
;WITH Numbers AS
(
SELECT ROW_NUMBER() OVER(ORDER BY (SELECT 0)) AS N
FROM JobId
),
Split AS
(
SELECT JOB_NUMBER, USER_ASSIGNED, SUBSTRING(PENDING_INFO, Numbers.N, CHARINDEX(',', PENDING_INFO + ',', Numbers.N) - Numbers.N) AS PENDING_NUM
FROM JobId
JOIN Numbers ON Numbers.N <= DATALENGTH(PENDING_INFO) + 1
AND SUBSTRING(',' + PENDING_INFO, Numbers.N, 1) = ','
)
SELECT *
FROM Split JOIN Pending ON Split.PENDING_NUM = Pending.PENDING_NUM
The basic idea is that you have to multiply each row as many times as there are PENDING_NUMs. Then, extract the appropriate part of the string

While I agree with DBA perspective not to store multiple values in a single field it is doable, as bellow, practical for application logic and some performance issues. Let say you have 10000 user groups, each having average 1000 members. You may want to have a table user_groups with columns such as groupID and membersID. Your membersID column could be populated like this:
(',10,2001,20003,333,4520,') each number being a memberID, all separated with a comma. Add also a comma at the start and end of the data. Then your select would use like '%,someID,%'.
If you can not change your data ('01,02,03') or similar, let say you want rows containing 01 you still can use " select ... LIKE '01,%' OR '%,01' OR '%,01,%' " which will insure it match if at start, end or inside, while avoiding similar number (ie:101).

We Keep Coding

sql objective-c vba vb.net react-native apache vue.js tensorflow api pandas