yii caching and couchbase - yii

need some help to clarify the concept.
$sql = 'SELECT * FROM tbl_post LIMIT 20';
$dependency = new CDbCacheDependency('SELECT MAX(update_time) FROM tbl_post');
$rows = Yii::app()->db->cache(1000, $dependency)->createCommand($sql)->queryAll();
1.if the cache contains an entry indexed by the SQL statement.
2.if the dependency has not changed (the maximum update_time value is the same as when the query result was saved in the cache).
I do not understand what do the above explanation means. Especially second one with regards to maximum update_time. Please correct me if I am wrong.
There is a update_time column in tbl_post table. Whenever a row is updated, the update_time is updated too. If a post is retrieved from the cache, CDbCacheDependency will first query the database for MAX(update_time)? What is the purpose of this and how exactly does it works in keeping the cache updated?
Another question is regarding memcache. I understand that it is possible to cluster memcache servers. Say I have the below configurations.
1 memcache server in US. 1 memcache server in Europe.
My Yii website makes use of the cluster of 2 nodes. memcache will split the caching between the 2 nodes.
1.user A retrieves a post from database and cached it. assume (123,$model) in US node.
2.user B wants to retrieve the same post, from Europe. Will looking for key 123 finds the cache? Does it matters if both users are in US or Europe?
Thanks!!

After first run - DB component puts its result into cache. Also it puts there result of dependency-query (max update time in your case).
Then when your try to get data, db component executs dependency query and compare it with cached one. If dependency is unchanged (there is no new posts) it get query results from cache, in other case it execute`s query.

Related

How to know the first ran jobId of a cached query in big-query?

When we run a query in big-query environment, the results are cached in the temporary table. From next time onwards, when we ran the same query multiple times, the subsequent runs will fetch the results from the cache for the next 24 hrs with some exceptions. Now my use case is, in the subsequent runs, i want to know like from which jobId this query cache results are got, previous first time run of the query ??
I have checked all the java docs related to query didn't find that info. We have cacheHit variable, which will tell you whether the query has fetched from the cache or not . Here i want to know one step further, from what jobId, the results got fetched. I expected like, may be in this method i can know the info, but i am always getting the null value for that. I also want to know what is meant by parentJob in big-query context.
It's unclear why you'd even care about this other than as a technical exercise. If you want to build your own application caching layer that's a different concern. More details about query caching can be found on https://cloud.google.com/bigquery/docs/cached-results.
The easiest way to do this would probably be by traversing jobs.list until you find a job that has the same destination table (it'll be prefaced with an anon prefix), and where the cacheHit stat is false/not present.
Your inquiry about parentJob is unrelated to this exercise. It's for finding all the child jobs created as part of a script or multi-statement execution. More information about this can be found on https://cloud.google.com/bigquery/docs/reference/standard-sql/scripting-concepts.

How to get multiple data from gemfire cacheloader?

We are going to implement gemfire for our project. We are currently syncing gemfire cache with our DB2 database. So, we are facing issue while putting DB data into cache.
To put DB data into region. I have implement com.gemstone.gemfire.cache.CacheLoader and override load method of it. As written in java doc load method will return only one Object. But for our requirement we will have to return multiple VO from load method
public List<CmDvceInvtrGemfireBean> load(LoaderHelper<CmDvceInvtrGemfireBean, CmDvceInvtrGemfireBean> helper)
throws CacheLoaderException
While returining multiple VO in form of List<CmDvceInvtrGemfireBean> gemfire region consider it's as single value.
So, when i invoke,
System.out.println("return COUNT" + cmDvceInvtrRecord.query("SELECT COUNT(*) FROM /cmDvceInvtrRecord"));
It return count of one. But i can see total 7 number of data into it.
So, I want to implement the kind of mechanism that will put all the 7 values as a separate VO in Region
Is there any way to do this using Gemfire CacheLoader?
A CacheLoader was meant to load a value only for a single entry in the GemFire Region on a cache miss. As the Javadoc states...
..creates the value for the desired key..
While a key can map to a multi-valued (e.g. an array/Collection) value, the CacheLoader can only populate a single entry.
You will have to resort to other means of populating the cache with multiple "entries" in a single operation.
Out of curiosity, why do you need (requirement?) to load multiple entries (from the DB) at once? Are you trying to minimize the number of round trips to the DB?
Also, what logic are you using to decide what VO from the DB will be loaded based on the information (i.e. key) provided in the CacheLoader?
For instance, are you somehow trying to predictably select values from the DB based on the CacheLoader key that would subsequently minimize cache misses on future Region.get(key) calls?
Sorry, I don't have a better answer for you right now, but answers to some of these questions may help me give you some ideas for alternatives.
Cheers,
John

SQL - When data are transfered

i need to get a large amount of data from a remote database. the idea is do a sort of pagination, like this
1 Select a first block of datas
SELECT * FROM TABLE LIMIT 1,10000
2 Process that block
while(mysql_fetch_array()...){
//do something
}
3 Get next block
and so on.
Assuming 10000 is an allowable dimension for my system, let us suppose i have 30000 records to get: i perform 3 call to remote system.
But my question is: when executing a select, the resultset is transmitted and than stored in some local part with the result that fetch is local, or result set is stored in remote system and records coming one by one at any fetch? Because if the real scenario is the second i don't perform 3 call, but 30000 call, and is not what i want.
I hope I explained, thanks for help
bye
First, it's highly recommended to utilize MySQLi or PDO instead of the deprecated mysql_* functions
http://php.net/manual/en/mysqlinfo.api.choosing.php
By default with the mysql and mysqli extensions, the entire result set is loaded into PHP's memory when executing the query, but this can be changed to load results on demand as rows are retrieved if needed or desired.
mysql
mysql_query() buffers the entire result set in PHP's memory
mysql_unbuffered_query() only retrieves data from the database as rows are requested
mysqli
mysqli::query()
The $resultmode parameter determines behaviour.
The default value of MYSQLI_STORE_RESULT causes the entire result set to be transfered to PHP's memory, but using MYSQLI_USE_RESULT will cause the rows to be retrieved as requested.
PDO by default will load data as needed when using PDO::query() or PDO::prepare() to execute the query and retrieving results with PDO::fetch().
To retrieve all data from the result set into a PHP array, you can use PDO::fetchAll()
Prepared statements can also use the PDO::MYSQL_ATTR_USE_BUFFERED_QUERY constant, though PDO::fetchALL() is recommended.
It's probably best to stick with the default behaviour and benchmark any changes to determine if they actually have any positive results; the overhead of transferring results individually may be minor, and other factors may be more important in determining the optimal method.
You would be performing 3 calls, not 30.000. That's for sure.
Each 10.000 results batch is rendered on the server (by performing each of the 3 queries). Your while iterates through a set of data that has already been returned by MySQL (that's why you don't have 30.000 queries).
That is assuming you would have something like this:
$res = mysql_query(...);
while ($row = mysql_fetch_array($res)) {
//do something with $row
}
Anything you do inside the while loop by making use of $row has to do with already-fetched data from your initial query.
Hope this answers your question.
according to the documentation here all the data is fetched to the server, then you go through it.
from the page:
Returns an array of strings that corresponds to the fetched row, or FALSE if there are no more rows.
In addition it seams this is deprecated so you might want to use something else that is suggested there.

Get Yii DB Cache id of saved record in cache table

I have a query like this:
$record=books::model()->cache(10000)->find('id=:id',array(":id"=>$id));
Yii uses CDbCache to save the result in the cache table
id expire value
-- ------ ------
My question is:
How to get the id of the cache record -this id is generated by yii - in
yii cache table?
There's no public method to get the cache ID, as it incorporates a lot of variables, you could if you really wanted to but it's really dark down in the depths of Yii.
Answer based on comments:
You could always set a last cached time on your record, which you can then use to calculate the time remaining. But this won't persist over page spans, because your caching you result and it won't get read again. So your options are:
Do the caching manually, and store the time against the object to be used for calculation
Or store the time for each ID pulled in the session, and use that to calculate it.
To Answer this question. By the way the Yii code is written, you can't get the ID/Key for the Cache which the YII generates internally.
If you look at the sourcecode
https://github.com/yiisoft/yii/blob/1.1.16/framework/db/CDbCommand.php
Method queryInternal
You'll realize that the cacheKey generation is the complex thing. Moreover this cacheKey will further be Hashed by MD5
Refer: http://www.yiiframework.com/doc/api/1.1/CCache#generateUniqueKey-detail
If you are using 'cache' as part of QueryBuilder, then we are giving the power to Yii to manage this cache by providing the Expire Duration and Dependency Query.
If you want to be in command/control of the Expire Duration and want to have the ID, then only way is to manually set the CacheID as suggested by the user
Paystey as belowt:
$dbResult = Yii::app()->cache->get('YOUR-KNOWN-KEY');
if($dbResult === false){
$dbResult = YOUR-QUERY;
Yii::app()->cache->set('YOUR-KNOWN-KEY, $dbResult);
}

neo4j count nodes performance on 200K nodes and 450K relations

We're developing an application based on neo4j and php with about 200k nodes, which every node has a property like type='user' or type='company' to denote a specific entity of our application. We need to get the count of all nodes of a specific type in the graph.
We created an index for every entity like users, companies which holds the nodes of that property. So inside users index resides 130K nodes, and the rest on companies.
With Cypher we quering like this.
START u=node:users('id:*')
RETURN count(u)
And the results are
Returned 1 row.Query took 4080ms
The Server is configured as default with a little tweaks, but 4 sec is too for our needs. Think that the database will grow in 1 month 20K, so we need this query performs very very much.
Is there any other way to do this, maybe with Gremlin, or with some other server plugin?
I'll cache those results, but I want to know if is possible to tweak this.
Thanks a lot and sorry for my poor english.
Finaly, using Gremlin instead of Cypher, I found the solution.
g.getRawGraph().index().forNodes('NAME_OF_USERS_INDEX').query(
new org.neo4j.index.lucene.QueryContext('*')
).size()
This method uses the lucene index to get "aproximate" rows.
Thanks again to all.
Mmh,
this is really about the performance of that Lucene index. If you just need this single query most of the time, why not update an integer with the total count on some node somewhere, and maybe update that together with the index insertions, for good measure run an update with the query above every night on it?
You could instead keep a property on a specific node up to date with the number of such nodes, where updates are done guarded by write locks:
Transaction tx = db.beginTx();
try {
...
...
tx.acquireWriteLock( countingNode );
countingNode.setProperty( "user_count",
((Integer)countingNode.getProperty( "user_count" ))+1 );
tx.success();
} finally {
tx.finish();
}
If you want the best performance, don't model your entity categories as properties on the node. In stead, do it like this :
company1-[:IS_ENTITY]->companyentity
Or if you are using 2.0
company1:COMPANY
The second would also allow you automatically update your index in a separate background thread by the way, imo one of the best new features of 2.0
The first method should also proof more efficient, since making a "hop" in general takes less time than reading a property from a node. It does however require you to create a separate index for the entities.
Your queries would look like this :
v2.0
MATCH company:COMPANY
RETURN count(company)
v1.9
START entity=node:entityindex(value='company')
MATCH company-[:IS_ENTITIY]->entity
RETURN count(company)