Get models with distinct attribute ActiveRecord - sql

I have a bunch of records in my database which all have the same Title but different Locations. Once I filter by a location boundary, I want to filter out the ones with a duplicate Title. Is there an ActiveRecord way to do this? I know about select, but that will only return titles, and I actually need the entire record.
So I have a Business which has a Title. If I select all of the businesses within a given lat/long boundary, multiple instances with the same name (say, Subway) will be returned. I want to limit the result to 10.
In English: Give me ten records (the entire record, not just certain columns) where every title is unique among the ten returned.

You can simply use .first, i.e.
Venue.where(name: "Subway").first
If you need more than one element, pass a parameter to first:
Venue.where(name: "Subway").first(10)
To select one entry per distinct value in some column, you can use .group("column_name"):
Venue.where(some_condition).group("name")
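The "one full record per distinct title" idea can also be written as plain SQL. Below is a minimal sketch in Python with an in-memory SQLite database, not ActiveRecord itself. The table name businesses, its columns, the sample rows and the helper distinct_by_title are all assumptions made for illustration; the sketch keeps the row with the lowest id for each title inside the boundary.

```python
import sqlite3

# Hypothetical schema and sample data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE businesses (id INTEGER PRIMARY KEY, title TEXT, lat REAL, lng REAL)"
)
conn.executemany(
    "INSERT INTO businesses (title, lat, lng) VALUES (?, ?, ?)",
    [
        ("Subway", 40.10, -73.10),
        ("Subway", 40.20, -73.20),
        ("Starbucks", 40.15, -73.15),
        ("Subway", 55.00, -10.00),  # outside the boundary used below
        ("Pizza Hut", 40.30, -73.30),
    ],
)


def distinct_by_title(conn, south, north, west, east, limit=10):
    """Return whole rows inside the boundary, one per title (lowest id wins)."""
    sql = """
        SELECT b.*
        FROM businesses AS b
        WHERE b.id IN (
            SELECT MIN(id)
            FROM businesses
            WHERE lat BETWEEN ? AND ? AND lng BETWEEN ? AND ?
            GROUP BY title
        )
        ORDER BY b.id
        LIMIT ?
    """
    return conn.execute(sql, (south, north, west, east, limit)).fetchall()


rows = distinct_by_title(conn, 40.0, 41.0, -74.0, -73.0)
for row in rows:
    print(row)
```

Picking MIN(id) is an arbitrary but deterministic tie-breaker; any other rule (nearest to a point, most recently updated) would need its own subquery.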

ModelName.where(title: "Building")
If you provide a more specific question, I'll provide a more specific answer...

Related

Return first 'unsorted' join in Oracle SQL

I have a table 'ACCOUNTS', with fields ACCTNO and ACPARENT. One account can be the parent of another. One account can have many children.
It's been discovered that certain external processes are using the 'first child' in certain reports and outputs - but there's no actual 'reason' for any particular child to be 'first', just an unintended bug in the code.
First step in untangling this - I need a query, that can be re-run (but not often, so optimisation is not really a factor) that will identify, for all accounts that are parents, what their 'first child' is.
Problem - the 'first child' isn't necessarily anything to do with record ID. If I run the following query, for example:
SELECT ACCTNO FROM ACCOUNTS WHERE ACPARENT = '80005217';
I get a result of:
ACCTNO
______
80007325
80007310
80007315
80007298
I can absolutely, 100% confirm that for this particular example, account 80007325 is the account ID being used as the 'first child'.
On the flipside, if I run a naive query of:
SELECT A1.ACCTNO, A2.ACCTNO AS CHILDACCOUNT FROM ACCOUNTS A1
INNER JOIN ACCOUNTS A2 ON A1.ACCTNO = A2.ACPARENT
WHERE A1.ACCTNO IN
(SELECT ACPARENT FROM ACCOUNTS);
then if I scroll down to where 80005217 is the parent account, I see the following list:
CHILDACCOUNT
______
80007298
80007310
80007315
80007325
It's sorted, which is exactly what I don't want.
Is there a query that will get me a list of what I want in a single query? A list of all parent accounts, and their 'first child' as returned by SQL unsorted?
To guarantee that records come back in a fixed order we must provide the database with sort criteria in the ORDER BY clause. If there is no attribute which defines "first-ness" then no guarantee is possible. Without an ORDER BY clause the records are essentially in an uncontrolled order, although because of database internals they often fall into some kind of pattern.
So, what makes account 80007325 the first child WHERE ACPARENT = '80005217'? Clearly not numerical order. Is there some other criterion? Date created? A flag column? Seems like you need to talk to your users. Do they really care which records come first? All the time or just in some specific report?
If your users cannot specify the criteria there's not much you can do...
...although I might be tempted to sort CHILDACCOUNT numerically by ACCTNO whenever it is displayed. At least that would provide consistency, and the users will get used to it.
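As a rough illustration of the "sort numerically" suggestion, the sketch below uses Python with an in-memory SQLite database rather than Oracle, and defines the first child as the lowest ACCTNO per parent. The sample rows come from the question; treating ACCTNO as text and the NULL parent for the top-level account are assumptions. Note that this yields a deterministic first child (80007298), not the one the legacy processes happened to pick (80007325).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ACCOUNTS (ACCTNO TEXT PRIMARY KEY, ACPARENT TEXT)")
conn.executemany(
    "INSERT INTO ACCOUNTS (ACCTNO, ACPARENT) VALUES (?, ?)",
    [
        ("80005217", None),  # assumed: the parent has no parent of its own
        ("80007325", "80005217"),
        ("80007310", "80005217"),
        ("80007315", "80005217"),
        ("80007298", "80005217"),
    ],
)

# One row per parent, with "first child" defined explicitly as the lowest ACCTNO.
first_children = conn.execute(
    """
    SELECT ACPARENT, MIN(ACCTNO) AS FIRST_CHILD
    FROM ACCOUNTS
    WHERE ACPARENT IS NOT NULL
    GROUP BY ACPARENT
    ORDER BY ACPARENT
    """
).fetchall()

print(first_children)
```

Because the account numbers here all have the same length, comparing them as text gives the same result as comparing them as numbers; with mixed lengths a numeric cast would be needed.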

Verify if parent and child, Hierarchy, HQL

I have a case where I am given two IDs and a table with two columns, say 'subj' and 'obj'. If the IDs don't match in a single row, then I have to take the obj value, search for it in the subj column of some other row, and try to match the object again, iteratively. The search ends when there is no subject entry for a particular object. There is no WITH clause in HQL, hence this question.
For example, say I am given 1 and 100. I search for 1 in subj and get its object entry. If it is not 100 but, say, 20, I take that 20 and search for (20, 100); if the object entry is again not 100, I repeat the process. This is possible in SQL, but since there is no WITH clause in HQL I need suggestions.
I can always do it in the application, but I am looking for another answer! The search ends when there is no corresponding subject entry for an object entry, or when it matches.
You need recursion for this, which is supported by some database implementations, but not by HQL.
HQL recursion, how do I do this?
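For reference, the application-side fallback mentioned in the question can be sketched as a simple loop. This is a minimal Python sketch, not HQL: the function name reaches and the next_obj callback are hypothetical, with next_obj standing in for a single-row lookup ("give me the obj for this subj"). A set of visited IDs is added as a guard against cycles, which the question does not mention.

```python
def reaches(next_obj, start, target):
    """Follow subj -> obj links from start; True if target is reached.

    next_obj is a hypothetical lookup returning the obj for a given subj,
    or None when no row has that subj.
    """
    seen = set()
    current = start
    while current is not None and current not in seen:
        seen.add(current)
        obj = next_obj(current)
        if obj == target:
            return True
        current = obj  # keep walking: the old obj becomes the next subj
    return False


# A dict stands in for the subj/obj table: 1 -> 20 -> 35 -> 100.
links = {1: 20, 20: 35, 35: 100}
print(reaches(links.get, 1, 100))
print(reaches(links.get, 1, 999))
```

Each step costs one lookup, so a chain of n links means n round trips if next_obj runs a query per call.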

Solr/Lucene result field term count

I am using Solr to do a search. As a result I get back a set of fields. One of the fields is "domains". The domains field is a many-to-many relationship in my database, so my docs contain an array of the "domains" they are linked to.
What I want to do is, for each domain in the result set, count how many times this "domain term" is found in the global result set.
How should I do this?
You need to look at the Field collapsing feature.
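To make the desired output concrete, here is a minimal client-side sketch in Python that counts how many documents mention each domain. The document shape (a list of dicts with a "domains" list) and the helper count_domain_terms are assumptions. It only sees the documents actually fetched, so for large result sets a server-side feature is preferable; Solr's faceting parameters (facet=true with facet.field=domains) are commonly used for this kind of per-term count.

```python
from collections import Counter


def count_domain_terms(docs, field="domains"):
    """Count, per term, the number of documents whose field contains it."""
    counts = Counter()
    for doc in docs:
        # set() so a term repeated inside one document is counted once
        counts.update(set(doc.get(field, [])))
    return counts


# Hypothetical documents as they might come back from a search.
docs = [
    {"id": "1", "domains": ["physics", "math"]},
    {"id": "2", "domains": ["math"]},
    {"id": "3", "domains": ["biology", "math", "physics"]},
]

counts = count_domain_terms(docs)
for term, n in counts.most_common():
    print(term, n)
```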

Solr: Search in multiple fields BUT STOP if documents match was found

I want to search in multiple fields in Solr.
(I know the concept of copy fields and I know the (e)dismax search handler.)
So I have an ordered list of fields that I want the terms to be searched against:
1.) SKU
2.) Name
3.) Description
4.) Summary
and so on.
Now, when the query matches a term, let's say in the SKU field, I want this match and no further searches in the subsequent fields.
Only if there are NO matches at all in the first field (SKU) should the second field (in this case "name") be used, and so on.
Is this possible with Solr?
Do I have to implement my own Lucene Search Handler for this?
Any advice is welcome!
Thank you,
Bernhard
I think your case requires executing 4 different searches. If you implement your very own SearchHandler, you could avoid the penalty of accumulating search results over 4 different requests. That means you would send one query, and the custom SearchHandler would execute the 4 searches and prepare one result set.
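The "stop at the first field that matches" logic can be sketched on the client side as follows. This is a minimal Python sketch, not a Solr SearchHandler: cascade_search, the search callback, fake_search and the sample products are all hypothetical, with the callback standing in for one single-field query against the index.

```python
def cascade_search(search, fields, terms):
    """Query fields in priority order; return the first non-empty result."""
    for field in fields:
        hits = search(field, terms)
        if hits:
            return field, hits  # stop here: lower-priority fields are skipped
    return None, []


# Stand-in for the index; a real search callback would issue one query per field.
PRODUCTS = [
    {"sku": "AB-100", "name": "Garden hose", "description": "Flexible 20 m hose", "summary": "Hose"},
    {"sku": "CD-200", "name": "Hose reel", "description": "Wall mount for AB-100", "summary": "Reel"},
]


def fake_search(field, terms):
    needle = terms.lower()
    return [p for p in PRODUCTS if needle in p[field].lower()]


FIELDS = ["sku", "name", "description", "summary"]
matched_field, hits = cascade_search(fake_search, FIELDS, "AB-100")
print(matched_field, [p["sku"] for p in hits])
```

Done this way, the worst case is one request per field, which is the overhead a custom handler on the server would avoid.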
If my guess is right, you want to rank the results based on the order of the fields. If so, you can just use a standard query like
q=sku:(query)^4 OR name:(query)^3 OR description:(query)^2 OR summary:(query)
This will rank the results by the order of the fields.
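A query string of that shape can be generated from the ordered field list. The helper below is a hypothetical Python sketch: it assigns descending boosts by position and omits the boost of 1 on the last field, and it does no escaping of special characters in the search terms.

```python
def build_boosted_query(fields, terms):
    """Build 'f1:(terms)^n OR f2:(terms)^(n-1) ...' from an ordered field list."""
    clauses = []
    for position, field in enumerate(fields):
        boost = len(fields) - position  # earlier fields get higher boosts
        clause = f"{field}:({terms})"
        if boost > 1:
            clause += f"^{boost}"
        clauses.append(clause)
    return " OR ".join(clauses)


query = build_boosted_query(["sku", "name", "description", "summary"], "query")
print(query)
```

Boosting only influences ranking: documents matching in lower-priority fields are still returned, just further down the list.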
Hope it helps.

How to design a database table structure for storing and retrieving search statistics?

I'm developing a website with a custom search function and I want to collect statistics on what the users search for.
It is not a full text search of the website content, but rather a search for companies with search modes like:
by company name
by area code
by provided services
...
How to design the database for storing statistics about the searches?
What information is most relevant and how should I query for them?
Well, it's dependent on how the different search modes work, but generally I would say that a table with 3 columns would work:
SearchType SearchValue Count
Whenever someone does a search, say they search for "Company Name: Initech", first query to see if there are any rows in the table with SearchType = "Company Name" (or whatever enum/id value you've given this search type) and SearchValue = "Initech". If there is already a row for this, UPDATE the row by incrementing the Count column. If there is not already a row for this search, insert a new one with a Count of 1.
By doing this, you'll have a fair amount of flexibility for querying it later. You can figure out what the most popular searches for each type are:
... WHERE SearchType = 'Some Search Type' ORDER BY Count DESC
You can figure out the most popular search types:
... GROUP BY SearchType ORDER BY SUM(Count) DESC
Etc.
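The update-or-insert flow and the two reporting queries can be tried out end to end. This is a minimal sketch in Python with an in-memory SQLite database; the table name SearchStats, the helper record_search and the sample searches are assumptions, and the counter column is called SearchCount here purely to keep it visually distinct from the COUNT() function.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE SearchStats (
        SearchType  TEXT NOT NULL,
        SearchValue TEXT NOT NULL,
        SearchCount INTEGER NOT NULL DEFAULT 0,
        PRIMARY KEY (SearchType, SearchValue)
    )
    """
)


def record_search(conn, search_type, search_value):
    """Increment the counter for this search, inserting the row if needed."""
    cur = conn.execute(
        "UPDATE SearchStats SET SearchCount = SearchCount + 1 "
        "WHERE SearchType = ? AND SearchValue = ?",
        (search_type, search_value),
    )
    if cur.rowcount == 0:  # no existing row was updated, so this search is new
        conn.execute(
            "INSERT INTO SearchStats (SearchType, SearchValue, SearchCount) "
            "VALUES (?, ?, 1)",
            (search_type, search_value),
        )
    conn.commit()


for search in [
    ("Company Name", "Initech"),
    ("Company Name", "Globex"),
    ("Area Code", "212"),
    ("Company Name", "Initech"),
]:
    record_search(conn, *search)

# Most popular values within one search type.
top_values = conn.execute(
    "SELECT SearchValue, SearchCount FROM SearchStats "
    "WHERE SearchType = ? ORDER BY SearchCount DESC",
    ("Company Name",),
).fetchall()

# Most popular search types overall.
top_types = conn.execute(
    "SELECT SearchType, SUM(SearchCount) AS Total FROM SearchStats "
    "GROUP BY SearchType ORDER BY Total DESC"
).fetchall()

print(top_values)
print(top_types)
```

Under concurrent writers the separate UPDATE and INSERT can race; a single-statement upsert, where the database supports one, avoids that.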
This is a pretty general question but here's what I would do:
Option 1
If you want to strictly separate all three search types, then create a table for each. For company name, you could simply store the CompanyID (assuming your website is maintaining a list of companies) and a search count. For area code, store the area code and a search count. If the area code doesn't exist, insert it. Provided services is most dependent on your setup. The most general way would be to store key words and a search count, again inserting if not already there.
Optionally, you could store search date information as well. As an example, you'd have a table with Provided Services Keyword and a unique ID. You'd have another table with an FK to that ID and a SearchDate. That way you could make sense of the data over time while minimizing storage.
Option 2
Treat all searches the same. One table with a Keyword column and a count column, incorporating SearchDate if needed.
You may want to check this:
http://www.microsoft.com/sqlserver/2005/en/us/express-starter-schemas.aspx