DQL query to return all files in a Cabinet in Documentum?

I want to retrieve all the files from a cabinet (called 'Wombat Insurance Co'). Currently I am using this DQL query:
select r_object_id, object_name from dm_document(all)
where folder('/Wombat Insurance Co', descend);
This is ok except it only returns a maximum of 100 results. If there are 5000 files in the cabinet I want to get all 5000 results. Is there a way to use pagination to get all the results?
I have tried this query:
select r_object_id, object_name from dm_document(all)
where folder('/Wombat Insurance Co', descend)
ENABLE (RETURN_RANGE 0 100 'r_object_id DESC');
with the intention of getting results in 100 file increments, but this query gives me an error when I try to execute it. The error says this:
com.emc.documentum.fs.services.core.CoreServiceException: "QUERY" action failed.
java.lang.Exception: [DM_QUERY2_E_UNRECOGNIZED_HINT]error:
"RETURN_RANGE is an unknown hint or is being used incorrectly."
I think I am using the RETURN_RANGE hint correctly, but maybe I'm not. Any help would be appreciated!
I have also tried using the hint ENABLE(FETCH_ALL_RESULTS 0) but this still only returns a maximum of 100 results.
To clarify, my question is: how can I get all the files from a cabinet?

You have already accepted an answer that uses DFS.
Since you are also working with DFC, this information might help you.
DFS:
If you are using DFS, you have to be aware of the number of concurrent sessions that DFS can consume.
I think it is 100 or 150.
DFC:
Actually, there is a limit on the number of results you can fetch via DFC (I'm not sure about DFS).
Go to your DFC application (Webtop, DA, or any other) and check the dfc.properties file.
# Maximum number of results to retrieve by a query search.
# min value: 1, max value: 10000000
#
dfc.search.max_results = 100
# Maximum number of results to retrieve per source by a query search.
# min value: 1, max value: 10000000
#
dfc.search.max_results_per_source = 400
A dfc.properties.full or similarly named file should be there, and you can verify these values against your system.
Note that I'm talking about the dfc.properties file on the Content Server side, not the client side.
If you use the ENABLE (RETURN_TOP) hint with DFC, there are two ways to fetch the results from the Content Server:
Object based
Row based
You configure this with the return_top_results_row_based parameter in the server.ini file.
All of these changes apply to the Documentum server side, not to your DFC/DQL client.

Aha, I've figured it out. Using DFS with Java (an abstraction layer on top of DFC) you can set the starting index for query results:
String queryStr = "select r_object_id, object_name from dm_document(all) "
    + "where folder('/Wombat Insurance Co', descend)";
PassthroughQuery query = new PassthroughQuery();
query.setQueryString(queryStr);
query.addRepository(repositoryStr);
QueryExecution queryEx = new QueryExecution();
queryEx.setCacheStrategyType(CacheStrategyType.DEFAULT_CACHE_STRATEGY);
queryEx.setStartingIndex(currentIndex); // set start index here
OperationOptions operationOptions = null;
// will return 100 results starting from currentIndex
QueryResult queryResult = queryService.execute(query, queryEx, operationOptions);
You can just increment the currentIndex variable to get all results.

Well, the hint is being used incorrectly. Start with 1, not 0.
There is no built-in limit in DQL itself. All results are returned by default. The reason you get only 100 results must have something to do with the way you're using DFC (or whichever other client you are using). Using IDfCollection in the following way will surely return everything:
IDfQuery query = new DfQuery("SELECT r_object_id, object_name "
+ "FROM dm_document(all) WHERE FOLDER('/System', DESCEND)");
IDfCollection coll = query.execute(session, IDfQuery.DF_READ_QUERY);
int i = 0;
while (coll.next()) i++;
System.out.println("Number of results: " + i);
In a test environment (CS 6.7 SP1 x64, MS SQL), this outputs:
Number of results: 37162
Now, there's proof. Using paging is however a good idea if you want to improve the overall performance in your application. As mentioned, start counting with the number 1:
ENABLE(RETURN_RANGE 1 100 'r_object_id DESC')
This way of paging requires that sorting be specified in the hint rather than as a DQL statement. If all you want is the first 100 records, try this hint instead:
ENABLE(RETURN_TOP 100)
In this case sorting with ORDER BY will work as you'd expect.
Lastly, note that adding (all) will not only find all documents matching the specified qualification, but all versions of every document. If this was your intention, that's fine.
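To make the 1-based row ranges concrete, here is a small Python sketch that builds the RETURN_RANGE hint for a given page. The helper name `return_range_hint` and the page-number convention are assumptions made for this illustration; only the hint syntax itself comes from the examples above.

```python
def return_range_hint(page: int, page_size: int = 100,
                      order_by: str = "r_object_id DESC") -> str:
    """Build a RETURN_RANGE hint for a 1-based page number.

    Rows are counted from 1, so page 1 covers rows 1..page_size,
    page 2 covers rows page_size+1..2*page_size, and so on.
    """
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be >= 1")
    start = (page - 1) * page_size + 1
    end = page * page_size
    return f"ENABLE(RETURN_RANGE {start} {end} '{order_by}')"


base = ("select r_object_id, object_name from dm_document(all) "
        "where folder('/Wombat Insurance Co', descend)")
for page in (1, 2, 3):
    print(f"{base} {return_range_hint(page)}")
```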

I've worked with the DFC API (with Java) for a while, and I don't remember any default limit on queries. IIRC we always got all of the documents; there was no limit. In fact (according to my notes) we had to set the limit explicitly with, for example, enable (return_top 2000). (As far as I know, the syntax might depend on the DBMS behind EMC Documentum.)
Just a guess: check your dfc.properties file.

Related

Identify Django queryset from SQL logs

I use Django 1.8.17 (I know it's not so young anymore).
I have logged slow requests on PostgreSQL for more than one minute.
I have a lot of trouble finding the queryset to which an SQL query listed in the logs belongs.
Is there an identifier that could be added to the queryset to find the associated SQL query in the logs, or a trick to easily identify it?
Here is an example of a common queryset that is almost impossible to identify, as I have several similar ones.
Queryset:
Video.objects.filter(status='online').order_by('created')
LOGs:
duration: 1056.540 ms statement: SELECT "video"."id", "video"."title",
"video"."description", "video"."duration", "video"."html_description",
"video"."niche_id", "video"."owner_id", "video"."views",
"video"."rating" FROM "video" WHERE "video"."status" = 'online'
ORDER BY "video"."created"
Desired LOGs:
duration: 1056.540 ms statement: SELECT "video"."id", "video"."title",
"video"."description", "video"."duration", "video"."html_description",
"video"."niche_id", "video"."owner_id", "video"."views",
"video"."rating" FROM "video" WHERE "video"."status" = 'online'
ORDER BY "video"."created" (ID=555)
Add middleware to log a warning when a query takes a long time:
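import logging
from django.conf import settings
from django.db import connection

logger = logging.getLogger(__name__)

class LongQueryLogMiddleware(object):
    def __init__(self, get_response):
        self.get_response = get_response

    def __call__(self, request):
        response = self.get_response(request)
        # connection.queries is only populated when DEBUG is True
        for q in connection.queries:
            if float(q['time']) >= settings.LONG_QUERY_TIME_SEC:
                logger.warning("Found long query (%s sec): %s", q['time'], q['sql'])
        return response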
I've made a small gist with all the code. Sorry for the indentation, GitHub keeps removing the indentation.
In the code above I only log the query, but you can add request information that will help you identify where the query comes from.
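As a rough sketch of what adding request information could look like, the filtering step can be pulled into a plain function that also receives the request path. The function name `slow_query_messages`, the sample data and the `niche` table are made up for this illustration; the only thing taken from Django is the shape of the `connection.queries` entries (dicts with 'sql' and 'time' keys, the time being a string in seconds).

```python
from typing import Dict, List


def slow_query_messages(queries: List[Dict[str, str]], request_path: str,
                        threshold_sec: float) -> List[str]:
    """Return one log message per query at or above the threshold.

    queries uses the shape of Django's connection.queries:
    dicts with 'sql' and 'time' keys, where 'time' is a string in seconds.
    """
    messages = []
    for q in queries:
        if float(q["time"]) >= threshold_sec:
            messages.append(
                "Found long query (%s sec) during %s: %s"
                % (q["time"], request_path, q["sql"])
            )
    return messages


# Hypothetical sample data standing in for connection.queries.
sample = [
    {"sql": 'SELECT "video"."id" FROM "video" WHERE "video"."status" = \'online\'',
     "time": "1.057"},
    {"sql": 'SELECT "niche"."id" FROM "niche"', "time": "0.002"},
]
for message in slow_query_messages(sample, "/videos/", threshold_sec=1.0):
    print(message)
```

Inside the middleware, `request.path` would be the natural value to pass as `request_path`, which at least narrows a slow statement down to the view that issued it.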
I don't know Django, so I may be off the mark, but there's a simple trick I heard from one of the people that runs RDS:
Add an identifier to your query as a comment.
So, include a UUID, ID, label, etc. to the query
-- as a comment
and that will flow through to the log. This is an easy way to tie Postgres log entries to specific methods/scripts, it sounds like it would need a bit of adaptation to be useful in your case. (If the idea applies at all.)
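A minimal Python sketch of the comment idea, assuming you have access to the raw SQL string before it is sent. It uses a trailing block comment (`/* ... */`) rather than `--` so the tag cannot swallow anything after it; the helper name `tag_sql` and the allowed label characters are choices made for this example, and wiring it into the Django ORM is not covered here.

```python
import re


def tag_sql(sql: str, label: str) -> str:
    """Append a label to a SQL statement as a trailing block comment.

    The label is restricted to letters, digits and '_', '.', ':', '=', '-'
    so it cannot close the comment early or inject SQL.
    """
    if not re.fullmatch(r"[A-Za-z0-9_.:=-]+", label):
        raise ValueError("label contains unsupported characters")
    # Drop trailing whitespace and a trailing semicolon before appending the tag.
    return f"{sql.rstrip().rstrip(';')} /* {label} */"


query = 'SELECT "video"."id" FROM "video" WHERE "video"."status" = \'online\''
print(tag_sql(query, "ID=555"))
```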

How to get specific content type document count from alfresco share UI using Lucene query

How to get specific content type document count in alfresco share UI using Lucene query?
I tried running a Lucene query in the Alfresco Share UI, but it only returns the first 100 results.
Is there a good way to get just the document count for a specific content type, or the document count under a specific Alfresco site?
Please suggest if there is any other best and useful way.
Thanks in Advance.
The class PatchDAO has a method that returns the number of nodes with a given type:
/**
* Gets the total number of nodes which match the given Type QName.
*
* @param typeQName the qname to search for
* @return count of nodes that match the typeQName
*/
public long getCountNodesWithTypId(QName typeQName);
where typeQName is, of course, the QName of the type.
This method should return the total count and should be the most efficient.
UPDATE:
If you need the count on a specific site this method is not actually usable.
ResultSet result = searchService.query(, SearchService.LANGUAGE_LUCENE, "+PATH:\"/app:company_home/cm:" + + "/*\"" + " +TYPE:\"" + + "\"" );
You can change the parameters as per your need.
Thanks,
Kintu
Hitting the database directly is a very bad idea, so don't even start getting into that bad habit.
Using the Alfresco foundational Java API would require that the Java class be deployed to the server, which is a pain.
The easiest way to do this is to use OpenCMIS. You can run OpenCMIS code remotely, and you can use its paging result set to page through the query results, see Apache CMIS: Paging query result

phalcon querybuilder total_items always returns 1

I make a query via createBuilder() and when executing it (getQuery()->execute()->toArray())
I got 10946 elements. I want to paginate it, so I pass it to:
$paginator = new \Phalcon\Paginator\Adapter\QueryBuilder(array(
"builder" => $builder,
"limit" => $limit,
"page" => $current_page
));
$limit is 25 and $current_page is 1, but when doing:
$page = $paginator->getPaginate();
$page->total_items;
returns 1.
Is that a bug or am I missing something?
UPD: it seems like when counting items it uses created sql with limit. There is no difference what limit is, limit divided by items per page always equals 1. I might be mistaken.
UPD2: Colleague helped me to figure this out, the bug was in the query phalcon produces: count() of the group by counts grouped elements. So a workaround looks like:
$dataCount = $builder->getQuery()->execute()->count();
$page->next = $page->current + 1;
$page->before = $page->current - 1 > 0 ? $page->current - 1 : 1;
$page->total_items = $dataCount;
$page->total_pages = ceil($dataCount / 100);
$page->last = $page->total_pages;
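The arithmetic behind that workaround, sketched in Python for clarity. Two things differ from the PHP above and are assumptions of this sketch: the page count is derived from the paginator's limit (25 in the question) rather than a fixed 100, and `next` is clamped to the last page. The `Page` class and `build_page` are illustrative names, not Phalcon API.

```python
import math
from dataclasses import dataclass


@dataclass
class Page:
    current: int
    before: int
    next: int
    last: int
    total_items: int
    total_pages: int


def build_page(total_items: int, limit: int, current: int) -> Page:
    """Derive pagination fields from a separately computed row count."""
    if limit < 1:
        raise ValueError("limit must be >= 1")
    total_pages = max(1, math.ceil(total_items / limit))
    current = min(max(current, 1), total_pages)  # clamp into the valid range
    return Page(
        current=current,
        before=max(current - 1, 1),
        next=min(current + 1, total_pages),
        last=total_pages,
        total_items=total_items,
        total_pages=total_pages,
    )


page = build_page(total_items=10946, limit=25, current=1)
print(page.total_pages)  # 438
```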
I know this isn't much of an answer, but this is most likely a bug. The great guys at Phalcon took on a massive job that is too big to do properly in their little free time, and things like PHQL, Volt and other big but non-core components do not receive as much attention as we'd like. Also, given that most of the time in the past 6 months was spent on v2, there are nearly 500 bugs about stuff like that, and counting. I came across considerable issues in ORM, Volt, Validation and Session, which in the end made me stick to other, not as cool but more proven solutions. When v2 comes out I'm sure all attention will be on the bug list and testing; until then we are mostly on our own. Given that it's all C right now, only a few enthusiasts get involved; with v2 this will also change.
If this is the only problem you are hitting, the best approach is to update your query to get the information you need yourself without getPaginate().

Magento Bulk update attributes

I am missing the SQL out of this to Bulk update attributes by SKU/UPC.
Running EE1.10 FYI
I have all the rest of the code working, but I'm not sure of the who/what/why of actually updating our attributes, and haven't been able to find them. My logic is:
Open a CSV and grab all SKUs and associated attributes into a 2D array
Parse the SKU into an entity_id
Take the entity_id and the attribute and run updates until finished
Take the rest of the day off since it's Friday
Here's my (almost finished) code, I would GREATLY appreciate some help.
/**
* FUNCTION: updateAttrib
*
* REQS: $db_magento
* Session resource
*
* REQS: entity_id
* Product entity value
*
* REQS: $attrib
* Attribute to alter
*
*/
See my response for working production code. Hope this helps someone in the Magento community.
While this may technically work, the code you have written is just about the last way you should do this.
In Magento, you really should be using the models provided by the code and not write database queries on your own.
In your case, if you need to update attributes for 1 or many products, there is a way for you to do that very quickly (and pretty safely).
If you look in: /app/code/core/Mage/Adminhtml/controllers/Catalog/Product/Action/AttributeController.php you will find that this controller is dedicated to updating multiple products quickly.
If you look in the saveAction() function you will find the following line of code:
Mage::getSingleton('catalog/product_action')
->updateAttributes($this->_getHelper()->getProductIds(), $attributesData, $storeId);
This code updates only the changed attributes for all the product IDs you pass in, for a single store at a time.
The first parameter is basically an array of Product IDs. If you only want to update a single product, just put it in an array.
The second parameter is an array that contains the attributes you want to update for the given products. For example if you wanted to update price to $10 and weight to 5, you would pass the following array:
array('price' => 10.00, 'weight' => 5)
Finally, the third parameter is the store ID you want these updates to apply to. Most likely this number will be either 1 or 0.
I would play around with this function call and use this instead of writing and maintaining your own database queries.
General Update Query will be like:
UPDATE
catalog_product_entity_[backend_type] cpex
SET
cpex.value = ?
WHERE cpex.attribute_id = ?
AND cpex.entity_id = ?
In order to find the [backend_type] associated with the attribute:
SELECT
  backend_type
FROM
  eav_attribute
WHERE entity_type_id =
  (SELECT
    entity_type_id
  FROM
    eav_entity_type
  WHERE entity_type_code = 'catalog_product')
AND attribute_id = ?
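The two queries above can be tried end to end with a small Python sketch. It runs against an in-memory SQLite database as a stand-in for MySQL, with heavily simplified tables: the real Magento tables have more columns (store_id, for example), and the ids 4, 60 and 1001 are made up. The helper `update_attribute` and the whitelist of backend types are assumptions of this sketch, not Magento code.

```python
import sqlite3

# In-memory stand-in for the Magento tables; only the columns used here exist.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE eav_entity_type (entity_type_id INTEGER, entity_type_code TEXT);
    CREATE TABLE eav_attribute (attribute_id INTEGER, entity_type_id INTEGER, backend_type TEXT);
    CREATE TABLE catalog_product_entity_decimal (entity_id INTEGER, attribute_id INTEGER, value REAL);
    INSERT INTO eav_entity_type VALUES (4, 'catalog_product');
    INSERT INTO eav_attribute VALUES (60, 4, 'decimal');
    INSERT INTO catalog_product_entity_decimal VALUES (1001, 60, 19.99);
""")

# Assumed set of backend types that have their own value table.
ALLOWED_BACKEND_TYPES = {"datetime", "decimal", "int", "text", "varchar"}


def update_attribute(conn, entity_id, attribute_id, value):
    """Look up the attribute's backend_type, then update the matching value table."""
    row = conn.execute(
        """SELECT backend_type FROM eav_attribute
           WHERE entity_type_id = (SELECT entity_type_id FROM eav_entity_type
                                   WHERE entity_type_code = 'catalog_product')
             AND attribute_id = ?""",
        (attribute_id,),
    ).fetchone()
    if row is None or row[0] not in ALLOWED_BACKEND_TYPES:
        raise ValueError("attribute has no per-type value table")
    # A table name cannot be a bound parameter, so it is built from the
    # whitelisted backend_type; the values themselves stay parameterized.
    table = f"catalog_product_entity_{row[0]}"
    cur = conn.execute(
        f"UPDATE {table} SET value = ? WHERE attribute_id = ? AND entity_id = ?",
        (value, attribute_id, entity_id),
    )
    return cur.rowcount


updated = update_attribute(conn, entity_id=1001, attribute_id=60, value=10.00)
new_value = conn.execute(
    "SELECT value FROM catalog_product_entity_decimal WHERE entity_id = 1001"
).fetchone()[0]
print(updated, new_value)
```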
You can get more info from the following blog article:
http://www.blog.magepsycho.com/magento-eav-structure-role-of-eav_attributes-backend_type-field/
Hope this helps you.

MOSS 2007: What is the source of "Directories"?

I'm trying to generate a new SharePoint list item directly using SQL Server. What's stopping me is the damn tp_DirName column. I have no idea how to create this value.
Just for instance, I have selected all tasks from AllUserData, and there are possible values for the column: 'MySite/Lists/Task', 'Lists/Task' and even 'MySite/Lists/List2'.
MySite is the FullUrl value from the Webs table. I can obtain it. But what about 'Lists/Task' and '/Lists/List2'? Where are they stored?
To put it outside the SQL context: which object has an attribute such as '/Lists/List2'? Where can I set it up in the GUI?
Just an FYI: writing directly to SharePoint's SQL tables is very much unsupported. You should really try to write something that uses the SharePoint Object Model. Writing to the SharePoint database directly means Microsoft will not support the environment.
I've discovered, that [AllDocs] table, in contrast to its title, contains information about "directories", that can be used to generate tp_DirName. At least, I've found "List2" and "Task" entries in [AllDocs].[tp_Leaf] column.
So the solution looks like this -- concatenate the following 2 components to get tp_DirName:
[Webs].[FullUrl] for the web, containing list, containing item.
[AllDocs].[tp_Leaf] for the list, containing item.
Concatenate the following 2 components to get tp_Leaf for an item:
(Item count in the list) + 1
'_.000'
Regards,
Well, my previous answer was not very useful, though it had a key to the magic. Now I have a really useful one.
Whatever they said, M$ is very liberal to the MOSS DB hackers. At least they provide the following documents:
http://msdn.microsoft.com/en-us/library/dd304112(PROT.13).aspx
http://msdn.microsoft.com/en-us/library/dd358577(v=PROT.13).aspx
Read? Then, you know that all folders are listed in the [AllDocs] table with '1' in the 'Type' column.
Now, let's look at 'tp_RootFolder' column in AllLists. It looks like a folder id, doesn't it? So, just SELECT the single row from the [AllDocs], where Id = tp_RootFolder and Type = 1. Then, concatenate DirName + LeafName, and you will know, what the 'tp_DirName' value for a newly generated item in the list should be. That looks like a solid rock solution.
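The concatenation step, sketched in Python. This assumes the two parts are joined with a '/' and that an empty DirName (a folder directly under the root web) adds no leading slash; both are inferences from the sample values in the question ('MySite/Lists/Task', 'Lists/Task'), not something the protocol documents were checked for here. The function name is made up.

```python
def join_dir_name(dir_name: str, leaf_name: str) -> str:
    """Join a folder's DirName and LeafName into the value used as tp_DirName.

    Assumption: parts are separated by '/', and empty parts are skipped so
    that no leading or doubled slash appears.
    """
    parts = [p.strip("/") for p in (dir_name, leaf_name) if p and p.strip("/")]
    return "/".join(parts)


print(join_dir_name("MySite/Lists", "Task"))  # MySite/Lists/Task
print(join_dir_name("Lists", "Task"))         # Lists/Task
```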
Now about tp_LeafName for the new items. Before, I wrote that the answer is (Item count in the list) + 1 + '_.000', that corresponds to the following query:
DECLARE @itemscount int;
SELECT @itemscount = COUNT(*) FROM [dbo].[AllUserData] WHERE [tp_ListId] = '...my list id...';
INSERT INTO [AllUserData] (tp_LeafName, ...) VALUES(CAST(@itemscount + 1 AS NVARCHAR(255)) + '_.000', ...)
Thus, I have to say I'm not sure that it works always. For items - yes, but for docs... I'll inquire into the question. Leave a comment if you want to read a report.
Hehe, there is a stored procedure named proc_AddListItem. I was almost right. MS people do the same, but instead of (count + 1) they use just... tp_ID :)
Anyway, now I know THE SINGLE RIGHT answer: I have to call proc_AddListItem.
UPDATE: Don't forget to present the data from the [AllUserData] table as a new item in [AllDocs] (just insert id and leafname, see how SP does it itself).