How to select with bitwise flag values in SQL

How to select with bitwise flag values in SQL - sql

I have two tables in a SQL Server DB. One table BusinessOperations has various information about this business object, the other table OperationType is purely a bitwise flag table that looks like this:
| ID | Type | BitFlag |
| 1 | Basic-A | -2 |
| 2 | Basic | -1 |
| 3 | Type A | 0001 |
| 4 | Type B | 0002 |
| 5 | Type C | 0004 |
| 6 | Type D | 0008 |
| 7 | Type E | 0016 |
| 8 | Type F | 0032 |
The BitFlag column is a varchar column, the bitflags were inserted as '0001' as an example. In the BusinessOperations table, there's a column where the application that uses these tables updates it based on what is selected in the application's UI. As an example, I have one type which has the Basic,Type A, and Type B types selected. The column value in BusinessOperations is 3.
Based on this, I am trying to write a query which will show me something like this:
| ID | Name | Description | OperationType |
| 1 | Test | Test | Basic, Type A, Type B |
Here is the actual layout of the BusinessOperations table (Basic-A and Basic are bit columns:
| ID | Name | Description | Basic-A | Basic | OperationType |
| 1 | Test | Test | 0 | 1 | 3 |
There is nothing that relates these two tables to each other, so I cannot perform a join. I am very inexperienced with bitwise operations and am at a loss on how exactly to structure my select query which is being used to analyze this data. I feel like it needs a STUFF or CASE, but I don't know how I can get this to just show the types and not just the resultant BitFlag.
SELECT ID, Name, Description, OperationType
FROM OperationType
ORDER BY ID

Since you're storing the flag in OperationType as a VARCHAR, the first thing you need to do to is CONVERT or CAST the string to a number so we can do proper bitwise comparisons. I'm slightly unfamiliar with SQL Server, but you may need to remove the leading zeroes before the cast. Thus, the OperationType column in our desired SQL will look something like
CONVERT(INT, BitFlag)
Then, comparing that to our OperationType column would look something like
CONVERT(INT, BitFlag) & OperationType
The full query would look something like (forgive my lack of SQL Server expertise again):
SELECT bo.ID, bo.Name, bo.Description, ot.Type
FROM BusinessOperations AS bo
JOIN OperationType AS ot
ON CONVERT(INT, ot.BitFlag) & OperationType <> 0
The above query will effectively get you a list of the OperationTypes. If you absolutely need them on one line, see other answers to learn how to emulate something like GROUP_CONCAT in SQL Server. Disclaimer: Joining on a bitmask gives no guarantee of performance.
The other problem this answer does not solve is that of your legacy Basic and Basic-A fields. Personally, I'd do one of two things:
Remove them from the OperationType table and have the application tack the two on, based on the Basic and Basic-A columns as appropriate.
Put Basic and Basic-A as their own, positive flags in the OperationType table, and have the application populate the legacy columns as well as the OperationType column as appropriate.
As Aaron Bertrand has said in the comments, this really isn't an issue for Bitmasking at all. Having a many-many table that associates BusinessOperations.ID to OperationType.ID would solve all your problems in a much better way.

In the BusinessOperations table the Basic-A and Basic field are bit fields which is just another way of saying the value can only be a 1 or 0. Think of it like a boolean value True/False. So, in your query you can check each of those to determine whether to include 'Basic-A' and 'Basic' or not.
The OperationType is probably an id which you can lookup in the OperationsType table to get the Type and BitFlag. Without understanding your data completely it looks as if you could do a join for that part. Hopefully that is in the right general direction. If not, let me know.

Related

mapping of areas with multiple users

I have areas like sector 1, sector 1 a, sector 1 b, sector 1 c and multiple cable operators who are working in either full sector(i.e sector 1) or any of the sub sectors. I have created table of cable operators and want to map them with areas. If I set up area table like sector 1, sector 1 a, sector 1 b, sector 1 c each with their own Primary Key then how can I reference these sectors in single row of cable operators provided that we have to get the cable operators working in that particular sector.
My table structures are as follows:
Operators
| id | name
| 1 | 'abc'
| 2 | 'def'
| 3 | 'ghi'
areas
| id | name
| 1 | 'sector 1'
| 2 | 'sector 1a'
| 3 | 'sector 1b'
| 4 | 'sector 1c'
| 5 | 'sector 1d'
| 6 | 'sector 2'
| 7 | 'sector 2a'
| 8 | 'sector 2b'
| 9 | 'sector 2c'
| 10 | 'sector 2d'
I have operatorsareas table where I have map operators with areas as follows:
operatorsareas
| op_id | area_id
| 1 | 1
| 2 | 1
| 3 | 1
| 1 | 7
| 2 | 8
| 3 | 7
Now I have used this query which gives me no result:
select o.id, o.name from operator as o
where not exists(select * from areas a where id in (1,7,8) and not exists(select * from operatorareas as oa where oa.operatorid=o.id
and oa.areaid = a.id))
I have taken the reference of following link:
SQL query through an intermediate table
I need a guidance regarding structuring of the tables.

Initial Problem
Your Sector/Subsector designation breaks 1NF:
Each domain [column] must be Atomic wrt to the [datatypes available in the] platform.
That is a gross Normalisation error, which will have horrendous consequences downstream. The correction is:
Sector is one datum, one column, eg. Sector 1
SubSector is a separate datum, a separate column, eg. a, b, c
The Data • What is it ?
I need a guidance regarding structuring of the tables.
Ok. But what the data actually means, is not at all clear.
From the little info you have given, the following Predicates can be derived:
Each Operator is assigned to 0-or-1 Area
Assumption: an Operator cannot be in more than one place at a time
Assumption: an Operator may not be assigned
Each [assigned] Area is one of { Sector | SubSector | Unassigned }
AreaType is the Discriminator
Each Sector comprises 0-to-n SubSectors
Each Sector is occupied by 0-to-n Operators
Each SubSector is occupied by 0-to-n Operators
Please check and ensure that each is true (otherwise the data model is garbage).
Relational Data Model
Assuming those Predicates are correct, the Normalised Relational data model is:
Subtype • Exclusive
operators are working in either full sector or any of the sub sectors
What you are seeking in Logic terms is an OR Gate, in Relational terms, it is an Exclusive Subtype
Refer to Subtype for full details on Subtype implementation.
Note • Notation
All my data models are rendered in IDEF1X, the Standard for modelling Relational databases since 1993
My IDEF1X Introduction is essential reading for beginners or novices, wrt a Relational database.
The Query • What is it ?
Now I have used this query which gives me no result
We do not know what result set you are attempting to obtain.
At this point, it does not appear to be related to the linked Question & Answer.
Please explain what result set you would like to obtain, in English. Hopefully observing the given data model.
Supplying the required SQL would then a simple matter.
Enjoy. Please feel free to ask specific questions.

Combining two rowsets in ADLA without join on clause

I've got two types of input files I'm loading into an ADLA job. In one, I've got a bunch of data (left) and in another, I've got a list of values that are important to me (right).
As an example here, let's say I'm using the following in my "left" rowset:
| ID | URL |
|----|-------------------------|
| 1 | https://www.google.com/ |
| 2 | https://www.yahoo.com/ |
| 3 | https://www.hotmail.com/|
I'll have something like the following in my right rowset:
| ID | Name | Regex | Exceptions | Other Lookup Val |
|----|-------|-------------|------------|------------------|
| 1 | ThisA | /[a-z]{3,}/ | abc | 091238 |
| 2 | ThatA | /[a-z]{3,}/ | xyz | lksdf9 |
| 3 | OtherA| /[a-z]{3,}/ | def | 098143 |
As each are loaded via an EXTRACT statement, both are in separate rowsets. Ideally, I'd like to be able to load all the values for both rowsets and loop through the right one to run a series of calculations against the left one to find a match per various business rules. Notably, there's no value to simply join on, nor is it a simple Regex evaluation, but rather something a bit more involved. Thus, the output might just look something like the "left" rowset:
| ID | URL |
|----|-------------------------|
| 1 | https://www.google.com/ |
| 3 | https://www.hotmail.com/|
Now, a COMBINER is the only UDO I see that accepts two rowsets, but the U-SQL syntax requires that I do some sort of join statement here. There's no common identifier between each of the rowsets though, so there's nothing to join on, which suddenly makes this seem less ideal. Of the attribute options defined at https://learn.microsoft.com/en-us/azure/data-lake-analytics/data-lake-analytics-u-sql-programmability-guide#use-user-defined-combiners, I'd like to specify this as a Full because I'd need each of the left values available to evaluate against each of the right ones, but again, no shared identifier to do this on.
I then tried to use a REDUCER that accepted an IRowset in the IReducer constructor as a parameter, then tried to just pass the rowset in from the U-SQL, but it didn't like that syntax.
Is there any way to perform this custom combining in a manner that doesn't require a JOIN ON clause?

It sounds like you may be able to use an IProcessor. This would allow you to analyze each row in the RIGHT set and add a column (with a value based on your business rules) that you can subsequently use to join to the LEFT set.
[Adding a bit more detail]: You could also do this twice, once for the left and once for the right to create an artificial join column, like row_number or some such.

Gather single rows from multiple tables in Microsoft Access

I have several tables in Microsoft Access 2013, all of which follow the same format of:
ID | Object | Person 1 | Person 2 | Person 3 |
ID | String | Yes/No | Yes/No | Yes/No |
What I would like to do is make a query where I put in a string value for each table and it prints out the entire row, with each string getting its own row, so it looks like:
ID Number | Object | Person 1...
Table 1 ID | Table 1 String | Table 1 Yes/No...
Table 2 ID | Table 2 String | Table 2 Yes/No...
Every time I try, though, it puts all the data into one extremely long row that's impossible to look at. All of my searching has only turned up people trying to do the exact opposite of what I'm doing, though, so I must be missing something obvious. Any tips?

SQL join two tables using value from one as column name for other

I'm a bit stumped on a query I need to write for work. I have the following two tables:
|===============Patterns==============|
|type | bucket_id | description |
|-----------------------|-------------|
|pattern a | 1 | Email |
|pattern b | 2 | Phone |
|==========Results============|
|id | buc_1 | buc_2 |
|-----------------------------|
|123 | pass | |
|124 | pass |fail |
In the results table, I can see that entity 124 failed a validation check in buc_2. Looking at the patterns table, I can see bucket 2 belongs to pattern b (bucket_id corresponds to the column name in the results table), so entity 124 failed phone validation. But how do I write a query that joins these two tables on the value of one of the columns? Limitations to how this query is going to be called will most likely prevent me from using any cursors.

Some crude solutions:
SELECT "id", "description" FROM
Results JOIN Patterns
ON "buc_1" = 'fail' AND "bucket_id" = 1
union all
SELECT "id", "description" FROM
Results JOIN Patterns
ON "buc_2" = 'fail' AND "bucket_id" = 2
Or, with a very probably better execution plan:
SELECT "id", "description" FROM
Results JOIN Patterns
ON "buc_1" = 'fail' AND "bucket_id" = 1
OR "buc_2" = 'fail' AND "bucket_id" = 2;
This will report all failure descriptions for each id having a fail case in bucket 1 or 2.
See http://sqlfiddle.com/#!4/a3eae/8 for a live example
That being said, the right solution would be probably to change your schema to something more manageable. Say by using an association table to store each failed test -- as you have in fact here a many to many relationship.

An other approach if you are using Oracle ≥ 11g, would be to use the UNPIVOT operation. This will translate columns to rows at query execution:
select * from Results
unpivot ("result" for "bucket_id" in ("buc_1" as 1, "buc_2" as 2))
join Patterns
using("bucket_id")
where "result" = 'fail';
Unfortunately, you still have to hard-code the various column names.
See http://sqlfiddle.com/#!4/a3eae/17

It looks to me that what you really want to know is the description(in your example Phone) of a Pattern entry given the condition that the bucket failed. Regardless of the specific example you have you want a solution that fulfills that condition, not just your particular example.
I agree with the comment above. Your bucket entries should be tuples(rows) and not arguments, and also you should share the ids on each table so you can actually join them. For example, Consider adding a bucket column and index their number then just add ONE result column to store the state. Like this:
|===============Patterns==============|
|type | bucket_id | description |
|-----------------------|-------------|
|pattern a | 1 | Email |
|pattern b | 2 | Phone |
|==========Results====================|
|entity_id | bucket_id |status |
|-------------------------------------|
|123 | 1 |pass |
|124 | 1 |pass |
|123 | 2 | |
|124 | 2 |fail |
1.-Use an Inner Join: http://www.w3schools.com/sql/sql_join_inner.asp and the WHERE clause to filter only those buckets that failed:
2.-Would this example help?
SELECT Patterns.type, Patterns.description, Results.entity_id,Results.status
INNER JOIN Results
ON
Patterns.bucket_id=Results.bucket_id
WHERE
Results.status=fail
Lastly, I would also add a primary_key column to each table to make sure indexing is faster for each unique combination.
Thanks!

SQL: Creating a common table from multiple similar tables

I have multiple databases on a server, each with a large table where most rows are identical across all databases. I'd like to move this table to a shared database and then have an override table in each application database which has the differences between the shared table and the original table.
The aim is to make updating and distributing the data easier as well as keeping database sizes down.
Problem constraints
The table is a hierarchical data store with date based validity.
table DATA (
ID int primary key,
CODE nvarchar,
PARENT_ID int foreign key references DATA(ID),
END_DATE datetime,
...
)
Each unique CODE in DATA may have a number of rows, but at most a single row where END_DATE is null or greater than the current time (a single valid row per CODE). New references are only made to valid rows.
Updating the shared database should not require anything to be run in application databases. This means any override tables are final once they have been generated.
Existing references to DATA.ID must point to the same CODE, but other columns do not need to be the same. This means any current rows can be invalidated if necessary and multiple occurrences of the same CODE may be combined.
PARENT_ID references must have same parent CODE before and after the split. The actual PARENT_ID value may change if necessary.
The shared table is updated regularly from an external source and these updates need to be reflected in each database's DATA. CODEs that do not appear in the external source can be thought of as invalid, new references to these will not be added.
Existing functionality will continue to use DATA, so the new view (or alternative) must be transparent. It may, however, contain more rows than the original provided earlier constraints are met.
New functionality will use the shared table directly.
Select performance is a concern, insert/update/delete is not.
The solution needs to support SQL Server 2008 R2.
Possible solution
-- in a single shared DB
DATA_SHARED (table)
-- in each app DB
DATA_SHARED (synonym to DATA_SHARED in shared DB)
DATA_OVERRIDE (table)
DATA (view of DATA_SHARED and DATA_OVERRIDE)
Take an existing DATA table to become DATA_SHARED.
Exclude IDs with more than one possible CODE so only rows common across all databases remain. These missing rows will be added back once the data is updated the first time.
Unfortunately every DATA_OVERRIDE will need all rows that differ in any table, not only rows that differ between DATA_SHARED and the previous DATA. There are several IDs that differ only in a single database, this causes all other databases to inflate. Ideas?
This solution causes DATA_SHARED to have a discontinuous ID space. It's a mild annoyance rather than a major issue, but worth noting.
edit: I should be able to keep all of the rows in DATA_SHARED, just invalidate them, then I only need to store differing rows in DATA_OVERRIDE.
I can't think of any situations where PARENT_ID references become invalid, thoughts?
Before:
DB1.DATA
ID | CODE | PARENT_ID | END_DATE
1 | A | NULL | NULL
2 | A1 | 1 | 2020
3 | A2 | 1 | 2010
DB2.DATA
ID | CODE | PARENT_ID | END_DATE
1 | A | NULL | NULL
2 | X | NULL | NULL
3 | A2 | 1 | 2010
4 | X1 | 2 | NULL
5 | A1 | 1 | 2020
After initial processing (DATA_SHARED created from DB1.DATA):
SHARED.DATA_SHARED
ID | CODE | PARENT_ID | END_DATE
1 | A | NULL | NULL
3 | A2 | 1 | 2010
-- END_DATE is omitted from DATA_OVERRIDE as every row is implicitly invalid
DB1.DATA_OVERRIDE
ID | CODE | PARENT_ID
2 | A1 | 1
DB2.DATA_OVERRIDE
ID | CODE | PARENT_ID
2 | X |
4 | X1 | 2
5 | A1 | 1
After update from external data where A1 exists in source but X and X1 don't:
SHARED.DATA_SHARED
ID | CODE | PARENT_ID | END_DATE
1 | A | NULL | NULL
3 | A2 | 1 | 2010
6 | A1 | 1 | 2020
edit: The DATA view would be something like:
select D.ID, ...
from DATA D
left join DATA_OVERRIDE O on D.ID = O.ID
where O.ID is null
union all
select ID, ...
from DATA_OVERRIDE
order by ID
Given the small number of rows in DATA_OVERRIDE, performance is good enough.
Alternatives
I also considered an approach where instead of DATA_SHARED sharing IDs with the original DATA, there would be mapping tables to link DATA.IDs to DATA_SHARED.IDs. This would mean DATA_SHARED would have a much cleaner ID-space and there could be less data duplication, but the DATA view would require some fairly heavy joins. The additional complexity is also a significant negative.
Conclusion
Thank you for your time if you made it all the way to the end, this question ended up quite long as I was thinking it through as I wrote it. Any suggestions or comments would be appreciated.

We Keep Coding

sql objective-c vba vb.net react-native apache vue.js tensorflow api pandas