I have a table called "parts" which stores information on electrical connectors including contacts, backshells etc. all parts that are part of an assembly have a value in a column called "assemblyID". There is also a column called "partDefID", in this column connectors will have a value of 2, contacts will be 3. I need to get rows for all connectors that have a unique assemblyID. It's easy to get rows that represent connectors just by selecting rows with a partDefID of 2 but this will return multiple rows of connectors that may be part of the same assembly. I need only those rows of connectors with a unique assemblyID. How can I do this?
see image below:
what I am trying to get is just ONE of the rows shown below, any one of them would be fine as they are all part of the same assembly.
just one of these rows needed
[update]
it seems my question was not well formed and the use of images is frowned upon. Inserting a text version of my table looked REALLY horrible though! I'll try to do better. yes, I'm a newb at both sql AND this website
If you want just one "connector" row per assembly ID, you can filter with a subquery. Assuming that PartRefID is a unique key:
select *
from parts as p
where [PartRefID] = (
select max(p1.[PartRefID])
from parts as p1
where p1.[AssemblyID] = p.[AssemblyID] and p1.[PartDefID] = 2
)
I don't know if this is what you are looking for
SELECT assemblyid,count(partdefid)
FROM parts
WHERE partdefid=2
GROUP BY partdefid,assemblyid
HAVING COUNT(partdefid)=1
Related
I have two tables I am left joining together. The first tables has transnational level detail, causing the key I join to the second table to duplicate. When I left join the second table, the measure "company_spend" is highly inflated.
I need a way to keep only a single value of the duplicated data, and my thought was to run a distinct function on only those columns, but I am not seeing that Bigquery supports distinct functions on only a few columns, but not all.
SELECT UPPER(cwnextt.Current_Contract_Number) AS Current_Contract_Number,
UPPER(cwnextt.Replacement_Contract_Number) AS Replacement_Contract_Number,
UPPER(cwnextt.Current_Contract_Name) AS Current_Contract_Name,
UPPER(cwnextt.Supplier_Top_Parent_Entity_Code) AS Supplier_Top_Parent_Entity_Code,
UPPER(cwnextt.Supplier_Top_Parent_Name) AS Supplier_Top_Parent_Name,
UPPER(cwnextt.company_Entity_Code) AS company_Entity_Code,
UPPER(cwnextt.Facility_Name) AS Facility_Name,
smart.company_Spend AS companySpend
FROM `test_etl_field.contracts_with_member_entity_codes_test_view_2` cwnextt
--this table is what is causing the below table to duplicate,
--but I need all of this data AS well in its current format.
LEFT JOIN `test.trans_analysis` tsa
ON TRIM(UPPER(cwnextt.company_entity_code)) = TRIM(UPPER(tsa.company_entity_code))
AND TRIM(UPPER(cwnextt.Supplier_Top_Parent_Entity_Code)) = TRIM(UPPER(tsa.manufacturer_top_parent_entity_code))
AND TRIM(UPPER(cwnextt.Current_Contract_Name)) = TRIM(UPPER(tsa.contract_category))
AND cwnextt.spend_period_yyyyqmm = tsa.spend_period_yyyyqmm
--this table contains "company_spend" which is now duplicated
LEFT JOIN `test_etl_field.ecr_smart_data` smart
ON smart.company_entity_code = cwnextt.company_entity_code
AND (smart.contract_number = cwnextt.current_contract_number
OR smart.contract_number = cwnextt.replacement_contract_number)
AND smart.month_key = cwnextt.spend_period_yyyyqmm
If something can be created that will keep company_spend from duplicating on the second left join, that is what I am after.
Not sure to understand all the details of your problem but here's a fact from BigQuery doc :
SELECT DISTINCT
A SELECT DISTINCT statement discards duplicate rows
and returns only the remaining rows.
You can't apply DISTINCT on specific columns because it doesn't make sense. Let's say you have 4 columns and call DISTINCT on 3 columns, what is SQL supposed to do with the last one ?
You must tell SQL which value to keep for the remaining column and GROUP BY is the right solution here.
So if you want to:
Remove a column that has been duplicated : Just adjust your SELECT to get only the columns you want
Remove lines that have the same value in specific columns : I would suggest a GROUP BY on the targeted column and taking the aggregation you want (first, avg, sum or whatever) for the remaining ones.
Remove the value from a row if another row has the same : You may not want to do that. A row has to keep its value and you won't get it back. Besides, same problem, which row do you want to keep ?
Hope this helps ! Feel free to give clarification on your problem if you want more specific answers.
While I couldn't resolve this issue in SQL, I used Tableau via a FIXED LOD to aggregate the data passed duplicates so the end user could visualize the output with accuracy. Not ideal, but the SQL route wasn't make sense.
I have the following problem.
I have a table Entries that contains 2 columns:
EntryID - unique identifier
Name - some name
I have another EntriesMapping table (many to many mapping table) that contains 2 columns :
EntryID that refers to the EntryID of the Entries table
PartID that refers to a PartID in a seprate Parts table.
I need to write a SP that will return all data from Entries table, but for each row in the Entries table I want to provide a list of all PartID's that are registered in the EntriesMapping table.
My question is how do I best approach the deisgn of the solution to this, given that the results of the SP would regularly be processed by an app so performance is quite important.
1.
Do I write a SP that will select multiple rows per entry - where if there are more than one PartID's registered for a given entry - I will return multiple rows each having the same EntryID and Name but different PartID's
OR
2.
Do I write a SP that will select 1 row per entry in the Entries table, and have a field that is a string/xml/json that contains all the different PartID's.
OR
3. There is some other solution that I am not thinking of?
Solution 1 seems to me to be the better way to go, but I will be passing lots of repeating data.
Solution 2 wont pass extra data, but the string/json/xml would need to be processed additionally, resuling in larger cpu time per item.
PS: I feel like this is quite a common problem to solve, but I was unable to find any resource that can provide common solutions or some pros/cons to different approaches.
I think you need simple JOIN:
SELECT e.EntryId, e.Name, em.PartId
FROM Entries e
JOIN EntriesMapping em ON e.EntryId = em.EntryId
This will return what you want, no need for stored procedure for that.
Given those relationships, how do I limit the choice of Leader in a given record in GroupResults to only those StudentResults.IDs, which have Class&Group set to the same value as in the ID field of that record without creating forms and using VBA?
If I assign SELECT StudentResults.ID, StudentResults.FullName FROM StudentResults; to the Row Source in [Leader], like this ,
I get all the records in the table to choose from, regardless of the [Class&Group] field value, like this .
How do I restrict the assignable records to only those that belong to the corresponding group?
I'd spent a very long time trying to find a way to run a parametrised SQL query to pass the [Class&Group] to the WHERE clause, but eventually had to give up.
Thank you very much for your help!
P.S. I do realise that this may or may not be more of an ms-access, rather than SQL question.
Tables are not designed to be user interfaces. Conditional comboboxes, validation, etc. work best on forms. Comboxbox lookup dropdowns are more an Access GUI convenience to show parent table indicators for key number values.
When queries are then run from such tables, these drop downs fields show to help us humans who naturally understand names and indicators rather than integer primary/foreign keys. So instead of Student: 1, we see Student: John Doe. In fact, such table field drop downs even helps generate the same comboboxes on Access forms and reports in advance to avoid the designer in building them upon clicking the form icons on ribbon.
However, for your needs consider adjusting combobox by showing the [Class&Group] field so the user can see or match the group of specific Leader with appropriate one for current record in Class column. See adjusted query and column count/heads.
Row Source: SELECT s.ID, s.[Class&Group], s.FullName FROM StudentResults s
Bound Column: 1
Column Count: 3
Column Heads: Yes
Also, if you want the Leader name to always show when table or query is opened instead of ID, reverse the order in query and change bound column:
Row Source: SELECT s.FullName, s.[Class&Group], s.ID FROM StudentResults
Bound Column: 3
Column Count: 3
Column Heads: Yes
I've 250+ columns in customer table. As per my process, there should be only one row per customer however I've found few customers who are having more than one entry in the table
After running distinct on entire table for that customer it still returns two rows for me. I suspect one of column may be suffixed with space / junk from source tables resulting two rows of same information.
select distinct * from ( select * from customer_table where custoemr = '123' ) a;
Above query returns two rows. If you see with naked eye to results there is not difference in any of column.
I can identify which column is causing duplicates if I run query every time for each column with distinct but thinking that would be very manual task for 250+ columns.
This sounds like very dumb question but kind of stuck here. Please suggest if you have any better way to identify this, thank you.
Solving this one-time issue with sql is too much effort. Simply copy-paste to excel, transpose data into columns and use some simple function like "if a==b then 1 else 0".
I am trying to calculate hours flowing in and out of a cost center. When the cost center lends out an employee for an hour it's +1 and when they borrow an employee for an hour it's -1.
Right now I'm using a query that says
select
columns
from dbo.table
where EmployeeCostCenter <> ProjectCostCenter
So when ProjectCostCenter = ID_CostCenter it returns +HoursQuantity.
Then I update ID_CostCenter = EmployeeCostCenter then where ID_CostCenter = EmployeeCostCenter to take -HoursQuantity.
That works fine. The problem is when I import it to Spotfire I can't filter on the main table even after I added the table relations. Can anyone explain why?
I can upload the actual code if needed, but I use 4 queries and a couple of them are quite lengthy. The main table, a temp table to calculate incoming hours, and a temp table to calculate outgoing hours are the only ones involved in this problem I think.
(moved to answer to avoid lengthy discussion)
Essentially, data relations are used to populate filtering / marking between different data-sets. Just like in RDBMS, the relation is what Spotfire uses as the link between dataset. Essentially it's the same as the column or columns you join on. Thus, any column that you wish to filter in TableA and have the result set limited in TableB (or visa versa) must be a relation.
Column matches aren't related columns, but are associated for aggregations, category axis, etc within each visualization. So if TableA has "amount" and TableB has "amount debit" and you wanted to use both of these in an expression, say Sum([TableA].[amount],[TableB].[amount debit]), they would need to be matched in order to not produce erroneous results.
Lastly, once you set up your relations, you should check your filter panel to set up how you want the filtering to work. You can have the rows included, excluded, or ignored all together. Here is a link explaining that.