How can I display enum string key instead of integer value in SQL query? - sql

I am pretty new to SQL. I assume this is fairly simple, but I haven't been able to find a straightforward answer online.
I am writing a simple SQL query to group database records by an enum column, and display the count of each value. It works fine, but the output is displaying the enum integer, where I want it to display the string key of that enum value.
Here is an example of the SQL query:
SELECT COUNT(a.sound) as "Sound Count", a.sound
FROM animals a
GROUP BY a.sound
Here is the enum definition:
enum sound: {
bark: 0,
meow: 1,
moo: 2
}
And here is the output of the query:
Sound Count Sound
2 0
4 1
3 2
Whereas I really want:
Sound Count Sound
2 bark
4 meow
3 moo

You are asking the DB for info using SQL and so it will not have any knowledge of your Rails enums. You need to use Rails to make the query:
Animal.group(:sound).count(:sound)
=> {"bark"=>2, "meow"=>4, "moo"=>3}
For a pure sql answer with Postgresql:
SELECT temp.sound_count,
       CASE
           WHEN temp.sound = 0 THEN 'bark'
           WHEN temp.sound = 1 THEN 'meow'
           WHEN temp.sound = 2 THEN 'moo'
       END AS my_sound
FROM (SELECT COUNT(a.sound) AS sound_count, a.sound
      FROM animals a
      GROUP BY a.sound) AS temp;
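The CASE mapping above can be tried without a PostgreSQL server. The following is only a minimal sketch: it assumes SQLite (through Python's built-in sqlite3 module) instead of PostgreSQL, and the table contents are made up to match the counts in the question.

```python
import sqlite3

# Hypothetical in-memory table mirroring the question's `animals` table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE animals (id INTEGER PRIMARY KEY, sound INTEGER)")
conn.executemany(
    "INSERT INTO animals (sound) VALUES (?)",
    [(0,)] * 2 + [(1,)] * 4 + [(2,)] * 3,  # 2 barks, 4 meows, 3 moos
)

# Group on the integer column and translate it to its label with CASE.
QUERY = """
SELECT COUNT(a.sound) AS sound_count,
       CASE a.sound
           WHEN 0 THEN 'bark'
           WHEN 1 THEN 'meow'
           WHEN 2 THEN 'moo'
       END AS my_sound
FROM animals a
GROUP BY a.sound
ORDER BY a.sound
"""

rows = conn.execute(QUERY).fetchall()
print(rows)
```

The mapping lives in the query itself, so it has to be updated by hand whenever the enum in the application gains a new value.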

If you're not working on a legacy database and are able to change the schema, then I would suggest not using an integer backed enum. Using a string backed enum will make your database readable without the application code. Then when you add new values to your code, you don't need to document what the integers mean.
Instead of defining the enum as you do, define it as strings:
enum sound: {
bark: 'bark',
meow: 'meow',
moo: 'moo'
}
And make sure that the column in the database is also a string.
Now you get all the benefits of enum without the hassle of integers in the database. Your query will also work as-is and produce the result you asked for.
As long as the column is indexed, it's basically just as fast to query as an integer. It will just take a few more bytes of space.
If you want to enforce values on the database level, a postgres enum could also be considered.

Related

SQL count query number of lines per row

I am trying to count the number of URLs we have in a field in SQL. I have googled but cannot find anything!
So for example this could be in field "url" row 1 / id 1
url/32432
url/32434
So for example this could be field "url" in row 2 / id 2
url/32432
url/32488
url/32477
So if you were to run the query the count would be 5. There is no comma in between them, only space.
Kind Regards
Scott
This is a very bad layout for data. If you have multiple urls per id, then they should be stored as separate rows in another table.
But, sometimes we are stuck with other people's bad design decisions. You can do something like this:
select (length(replace(urls, 'url', 'urlx')) - length(urls)) as num_urls
Note that the specific functions for length() and replace() might vary, depending on the database.
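As a rough illustration of the length/replace trick, here is a small sketch using SQLite through Python's sqlite3 module. The table name pages and the column name urls are assumptions made for the example; the sample values are the ones from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table: one space-separated list of URLs per row.
conn.execute("CREATE TABLE pages (id INTEGER PRIMARY KEY, urls TEXT)")
conn.executemany(
    "INSERT INTO pages (id, urls) VALUES (?, ?)",
    [
        (1, "url/32432 url/32434"),
        (2, "url/32432 url/32488 url/32477"),
    ],
)

# Each occurrence of 'url' grows by one character after the replace,
# so the difference in length equals the number of occurrences.
per_row = conn.execute(
    "SELECT id, LENGTH(REPLACE(urls, 'url', 'urlx')) - LENGTH(urls) AS num_urls "
    "FROM pages ORDER BY id"
).fetchall()

total = sum(count for _, count in per_row)
print(per_row, total)
```

This only counts correctly if the marker string ('url' here) never appears anywhere else inside a URL.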

Checking Range in Comma Separated Values [SQL Server 2008]

I have a table with following structure
ID FirstName LastName CollectedNumbers
1 A B 10,11,15,55
2 C D 101,132,111
I want a boolean value based on the range of CollectedNumbers, e.g. if all CollectedNumbers are between 1 and 100 then True, if any is over 100 then False. Can anyone suggest the best way to accomplish this? The collected numbers won't always be sorted.
It so happens that you have a pretty simple way to see if values are 100 or over in the list. If such a value exists, then there are at least three characters between the commas. If the numbers are never more than 999, you could do:
select (case when ','+CollectedNumbers+',' not like '%,[0-9][0-9][0-9]%' then 1
else 0
end) as booleanflag
This happens to work for the break point of 100. It is obviously not a general solution. The best solution would be to use a junction table with one row per id and CollectedNumber.
Just create a function in the database that converts the string values (10,11,15,55) into a table and returns True/False, then call that function in the SELECT list of the query, like this:
Select
ID, FirstName, LastName,
dbo.fncCollectedNumbersResult(stringvalue) as Result
from yourTableName
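The logic such a function would need is small. The sketch below shows it in Python rather than T-SQL; the name collected_numbers_in_range and the default bounds are assumptions for illustration, not part of the original answer.

```python
def collected_numbers_in_range(collected: str, low: int = 1, high: int = 100) -> bool:
    """Return True only if every comma-separated number lies in [low, high]."""
    # Split the stored string into integers, skipping empty fragments.
    numbers = [int(part) for part in collected.split(",") if part.strip()]
    # An empty list is treated as "not in range" here; that choice is an assumption.
    return bool(numbers) and all(low <= n <= high for n in numbers)


print(collected_numbers_in_range("10,11,15,55"))
print(collected_numbers_in_range("101,132,111"))
```

Because every value is checked, the result does not depend on the numbers being sorted.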
I think the easiest you can do is build a C# function and use the builtin sqlclr to load it as a custom function you can then call.
Inside the C# function, you can then sort your numbers and make simple logic to return your true/false.

Returning the first X records in a postgresql query with a unique field

Ok so I'm having a bit of a learning moment here and, after figuring out A way to get this to work, I'm curious if anyone with a bit more postgres experience could help me figure out a way to do this without doing a whole lotta behind-the-scenes rails stuff (or doing a single query for each item I'm trying to get)... now for an explanation:
Say I have 1000 records, we'll call them "Instances", in the database that have these fields:
id
user_id
other_id
I want to create a method that I can call that pulls in 10 instances that all have a unique other_id field, in plain english (I realize this won't work :) ):
Select * from instances where user_id = 3 and other_id is unique limit 10
So instead of pulling in an array of 10 instances where user_id is 3 and you can get multiple instances with the other_id is 5, I want to be able to run a map function on those 10 instances and get back something like [1,2,3,4,5,6,7,8,9,10].
In theory, I can probably do one of two things currently, though I'm trying to avoid them:
Store an array of id's and do individual calls making sure the next call says "not in this array". The problem here is I'm doing 10 individual db queries.
Pull in a large chunk of say, 50 instances and sorting through them in ruby-land to find 10 unique ones. This wouldn't allow me to take advantage of any optimizations already done in the database and I'd also run the risk of doing a query for 50 items that don't have 10 unique other_id's and I'd be stuck with those unless I did another query.
Anyways, hoping someone may be able to tell me I'm overlooking an easy option :) I know this is kind of optimizing before it's really needed but this function is going to be run over and over and over again so I figure it's not a waste of time right now.
For the record, I'm using Ruby 1.9.3, Rails 3.2.13, and Postgresql (Heroku)
Thanks!
EDIT: Just wanted to give an example of a function that technically DOES work (and is number 1 above)
def getInstances(limit, user)
  available = []
  other_ids = [-1] # seeded with -1 so the ARRAY[...] list is never empty
  until available.length == limit
    # One query per iteration: next instance whose other_id has not been seen yet
    instance = Instance.where("user_id = ? AND other_id <> ALL (ARRAY[?])", user.id, other_ids).first
    break if instance.nil? # ran out of instances with a new other_id
    available << instance
    other_ids << instance.other_id
  end
  available
end
And you would run:
getInstances(10, current_user)
While this works, it's not ideal because it's leading to 10 separate queries every time it's called :(
In a single SQL query, it can be achieved easily with SELECT DISTINCT ON... which is a PostgreSQL-specific feature.
See http://www.postgresql.org/docs/current/static/sql-select.html
SELECT DISTINCT ON ( expression [, ...] ) keeps only the first row of
each set of rows where the given expressions evaluate to equal. The
DISTINCT ON expressions are interpreted using the same rules as for
ORDER BY (see above). Note that the "first row" of each set is
unpredictable unless ORDER BY is used to ensure that the desired row
appears first
With your example:
SELECT DISTINCT ON (other_id) *
FROM instances
WHERE user_id = 3
ORDER BY other_id LIMIT 10
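To make the "first row of each set" behaviour concrete, here is a plain-Python sketch of what the query does. It is not PostgreSQL code: the sample rows are invented, and the sketch breaks ties by id, which the query above does not specify (without an extra ORDER BY column the row kept per other_id is unpredictable).

```python
from itertools import groupby, islice
from operator import itemgetter

# Hypothetical rows shaped as (id, user_id, other_id).
instances = [
    (1, 3, 5), (2, 3, 5), (3, 3, 7), (4, 2, 7), (5, 3, 9), (6, 3, 7),
]


def distinct_on_other_id(rows, user_id, limit):
    """Mimic SELECT DISTINCT ON (other_id) ... ORDER BY other_id, id LIMIT n."""
    # WHERE user_id = ? followed by ORDER BY other_id, id
    matching = sorted(
        (row for row in rows if row[1] == user_id),
        key=lambda row: (row[2], row[0]),
    )
    # Keep only the first row of each run of equal other_id values.
    firsts = (next(group) for _, group in groupby(matching, key=itemgetter(2)))
    # LIMIT n
    return list(islice(firsts, limit))


result = distinct_on_other_id(instances, user_id=3, limit=10)
print(result)
```

In the database the same work happens in a single query, so the ten round trips of the loop-based version disappear.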

search within an array with a condition

I have two arrays I'm trying to compare at many levels. Both have the same structure with 3 "columns".
The first column contains the polygon's ID, the second an area type, and the third the percentage of each area type for a polygon.
So, for many rows, it will compare, for example, ID : 1 Type : aaa % : 100
But for some elements, I have many rows for the same ID. For example, I'll have ID 2, Type aaa, 25% --- ID 2, type bbb, 25% --- ID 2, type ccc, 50%. And in the second array, I'll have ID 2, Type aaa, 25% --- ID 2, type bbb, 10% --- ID 2, type eee, 38% --- ID 2, type fff, 27%.
here's a visual example..
So, my function has to compare these two array and send me an email if there are differences.
(I wont show you the real code because there are 811 lines). The first "if" condition is
If array1.id = array2.id Then
    If array1.type = array2.type Then
        If array1.percent = array2.percent Then
            zone_verification = True
        Else
            zone_verification = False
        End If
    End If
End If
The problem is that there are more than 50,000 rows in each array. So when I run the function, for each "array1.id", the function searches through 50,000 rows in array2. 50,000 searches for 50,000 rows... it takes a very long time to run!
I'm looking for a way to make it run faster. How could I make my search more specific? For example: I have many rows with id "2" in array1. If there are many rows with id "2" in array2, find them, push all the rows where array2.id = 2 into a "sub array" or something like that, and search only in these specific rows. So I'll have just X rows in array1 to compare with X rows in array2, not with 50,000. And when each "id 2" in array1 is done, do the same thing for "id 4", and for "id 5"...
Hope it's clear. It's almost the first time I've used VB.net, and I have this big function to get running.
Thanks
EDIT
Here's what I wanna do.
I have two different layers in a geospatial database. Both layers have the same structure. They are a "spatial join" of the land parcels (55 000), and the land use layer. The first layer is the current one, and the second layer is the next one we'll use after 2015.
So I have, for each "land parcel", the percentage of each land use. So, for a "land parcel" (ID 7580-80-2532), I can have 50% farming use (TYPE FAR-23) and 50% residential use (RES-112). In the first array, I'll have 2 rows with the same ID (7580-80-2532), but each one will have a different type (FAR-23, RES-112) and a different %.
In the second layer, the municipal zoning (land use) has changed. So the same "land parcel" will now be 40% residential use (RES-112), 20% commercial (COM-54) and 40% a new farming use (FAR-33).
So, I wanna know if there are some differences. Some land parcels will be exactly the same. Some parcels will keep the same land use, but not the same percentage of each. But for some land parcel, there will be more or less land use types with different percentage of each.
I want this script to compare these two layers and send me an email when there are differences between these two layers for the same land parcel ID.
The script is already working, but it takes too much time.
The problem, I think, is that the script goes through all of array2 for each row in array1.
What I want is, when there is more than one row with the same ID in array1, to take only the rows with this ID in both arrays.
Maybe if I order them by ID, I could write a condition, something like "when you find what you're looking for, stop searching as soon as you find a different value"?
It's hard to explain it clearly because I've been using VB since last week.. And english isn't my first language! ;)
If you just want to find out if there are any differences between the first and second array, you could do:
Dim diff = New HashSet(of Polygon)(array1)
diff.SymmetricExceptWith(array2)
diff will contain any Polygon which is unique to array1 or array2. If you want to do other types of comparisons, maybe you should explain what you're trying to do exactly.
UPDATE:
You could use grouping and lookups like this:
'Create lookup with first array, for fast access by ID
Dim lookupByID = array1.ToLookup(Function(p) p.id)
'Loop through each group of items with same ID in array2
For Each secondArrayValues In array2.GroupBy(Function(p) p.id)
    Dim currentID As Integer = secondArrayValues.Key 'Current ID is the grouping key
    'Retrieve values with same ID in array1
    'Use a hashset to easily compare for equality
    Dim firstArrayValues As New HashSet(Of Polygon)(lookupByID(currentID))
    'Check for differences between the two sets of data, for this ID
    If Not firstArrayValues.SetEquals(secondArrayValues) Then
        'Data has changed, do something
        Console.WriteLine("Differences for ID " & currentID)
    End If
Next
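The same group-then-compare idea can be sketched in Python with dictionaries of sets. The helper names group_by_id and changed_ids and the sample rows are assumptions made for illustration; rows are (id, type, percent) tuples, and each array is walked only once instead of once per row of the other array.

```python
from collections import defaultdict


def group_by_id(rows):
    """Map each parcel ID to the set of its (type, percent) pairs."""
    groups = defaultdict(set)
    for parcel_id, land_type, percent in rows:
        groups[parcel_id].add((land_type, percent))
    return groups


def changed_ids(rows1, rows2):
    """Return the IDs whose (type, percent) sets differ between the two arrays."""
    first, second = group_by_id(rows1), group_by_id(rows2)
    # .get() returns None for an ID missing on one side, which also counts as a change.
    return sorted(
        pid for pid in first.keys() | second.keys()
        if first.get(pid) != second.get(pid)
    )


# Hypothetical sample data based on the parcel described in the question.
array1 = [
    ("7580-80-2532", "FAR-23", 50),
    ("7580-80-2532", "RES-112", 50),
    ("1", "aaa", 100),
]
array2 = [
    ("7580-80-2532", "RES-112", 40),
    ("7580-80-2532", "COM-54", 20),
    ("7580-80-2532", "FAR-33", 40),
    ("1", "aaa", 100),
]

changed = changed_ids(array1, array2)
print(changed)
```

Unlike the loop above, this sketch also reports IDs that exist in only one of the two arrays.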
I am answering this question based on the first part that you wrote (that is, without the EDIT section). A proper answer would explain a good algorithm, but I suggest using the database's capabilities, because databases are heavily optimized for this kind of query.
Put all the records into two DB tables, which takes O(n) time. If the records are static, you don't need to perform this step every time.
Table 1
id type percent
Table 2
id type percent
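Then use a DB query, something like this, which lists the rows of table1 that have no exact match in table2:
select t1.id, t1.type, t1.percent from table1 t1 left join table2 t2 on t1.id = t2.id and t1.type = t2.type and t1.percent = t2.percent where t2.id is null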
(you can use some better queries, what I am trying to say is give the control to DB to perform this operation)
retrieve the result in your code and perform the necessary operation.
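A minimal sketch of handing the comparison to the database, using SQLite through Python's sqlite3 module. The EXCEPT-based query is one possible formulation and an assumption of this example, not the query from the answer; the sample rows follow the "ID 2" example from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
for name in ("table1", "table2"):
    conn.execute(f"CREATE TABLE {name} (id TEXT, type TEXT, percent INTEGER)")

conn.executemany(
    "INSERT INTO table1 VALUES (?, ?, ?)",
    [("1", "aaa", 100), ("2", "aaa", 25), ("2", "bbb", 25), ("2", "ccc", 50)],
)
conn.executemany(
    "INSERT INTO table2 VALUES (?, ?, ?)",
    [("1", "aaa", 100), ("2", "aaa", 25), ("2", "bbb", 10),
     ("2", "eee", 38), ("2", "fff", 27)],
)

# IDs having at least one row that exists in only one of the two tables.
DIFF_QUERY = """
SELECT DISTINCT id FROM (
    SELECT id, type, percent FROM table1
    EXCEPT
    SELECT id, type, percent FROM table2
)
UNION
SELECT DISTINCT id FROM (
    SELECT id, type, percent FROM table2
    EXCEPT
    SELECT id, type, percent FROM table1
)
ORDER BY id
"""

differing_ids = [row[0] for row in conn.execute(DIFF_QUERY)]
print(differing_ids)
```

The application code then only has to loop over the returned IDs, for example to build the notification email.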
EDIT
1) You can sort them in O(n logn) time based on ID + type + Percent and then perform binary search.
2) Store the records of the first array in a hash map with an appropriate key (could be ID only, or ID+type).
Building it takes O(n) time, and each lookup with the right key then takes constant time.
You need to define a structure to store this data. We'll store all the data in a LandParcel class, which will have a HashSet<ParcelData>
public class ParcelData
{
public ParcelType Type { get; set; } // This can be an enum, string, etc.
public int Percent { get; set; }
// Redefine Equals and GetHashCode conveniently
}
public class LandParcel
{
public ID Id { get; set; } // Whatever the type of the ID is...
public HashSet<ParcelData> Data { get; set; }
}
Now you have to build your data structure, with something like this:
Dictionary<ID, LandParcel> data1 = new ....
foreach (var item in array1)
{
LandParcel p;
if (!data1.TryGetValue(item.id, out p))
data1[item.id] = p = new LandParcel(item.id);
// Can this data be repeated?
p.Data.Add(new ParcelData(item.type, item.percent));
}
You do the same with a data2 dictionary for the second array. Now you iterate for all items in data1 and compare them with the item with the same id for data2.
foreach (var parcel2 in data2.Values)
{
var parcel1 = data1[parcel2.Id]; // Beware of exceptions here: the ID may be missing from data1 !!!
if (!parcel1.Data.SetEquals(parcel2.Data))
// You have different parcels
}
(Now that I look at it, we are practically doing a small database query here, kind of smelly code ...)
Sorry for the C# code since I don't really feel so comfortable with VB, but it should be fairly straightforward.

Order by a field containing Numbers and Letters

I need to extract data from an existing Paradox database under Delphi XE2 (yes, more than 10 years separate them...).
I need to order the result by a field (id in the example) containing values such as '1', '2 a', '100', '1 b', '50 bis'... and get this:
- 1
- 1 b
- 2 a
- 50 bis
- 100
maybe something like that could do it, but those keywords don't exist :
SELECT id, TRIM(TRIM(ALPHA FROM id)) as generated, TRIM(TRIM(NUMBER FROM id)) as generatedbis, etc
FROM "my.db"
WHERE ...
ORDER BY generated, generatedbis
How could I achieve such an ordering with Paradox?
Try this:
SELECT id, CAST('0' + id AS INTEGER) A
FROM "my.db"
ORDER BY A, id
These ideas spring to mind:
create a sort function in Delphi that does the sort client-side, using a comparison/mapping function that rearranges the string into something that is comparable, maybe lexicographically.
add a column to the table whose data you wish to sort, that contains a modification of the values that can be compared with a standard string comparison and thus will work with ORDER BY
add a stored function to paradox that does the modification of the values, and use this function in the ORDER BY clause.
by modification, I mean something like, separate the string into components, and re-join them with each component right-padded with enough spaces so that all of the components are in the same position in the string. This will only work reliably if you can say with confidence that for each of the components, no value will exceed a certain length in the database.
I am making these suggestions with little to no knowledge of Paradox or Delphi, so you will have to take them with a grain of salt.
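The padding idea can be sketched as a mapping function that turns each value into a string that sorts correctly with a plain string comparison. This is Python rather than Delphi, and the name sort_key and the width of 10 are assumptions. Note that the numeric part is left-padded with zeros in this sketch, since right-padding a number would not put '2' before '100'.

```python
import re


def sort_key(value: str, width: int = 10) -> str:
    """Build a string that orders values like '1 b' and '50 bis' numerically first."""
    # Split into a leading run of digits and whatever follows it.
    match = re.match(r"\s*(\d*)\s*(.*)", value)
    number, suffix = match.group(1), match.group(2)
    # Zero-pad the number to a fixed width so string comparison matches numeric order.
    return number.rjust(width, "0") + " " + suffix


ids = ["1", "2 a", "100", "1 b", "50 bis"]
ordered = sorted(ids, key=sort_key)
print(ordered)
```

The generated key could equally be stored in an extra column and used in ORDER BY, as long as no number is longer than the chosen width.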