Temporal and spatial coherence - physics

What kind of results (is the interference pattern formed, is the total throughput intensity changed) would I get in a coherence experiment if:
1) Rc > R and t < tc
2) Rc > R and t > tc
3) Rc < R and t < tc
4) Rc < R and t > tc
where Rc and tc are the spatial and temporal coherence lengths of the beam (in this case a laser)?

In all four cases the interference pattern is not formed.
For interference to occur, both sources must have spatial and temporal coherence.
Refer to these links:
http://www.phys.ttu.edu/~huang24/Teaching/Phys2401/LectureNotes/Ch37A.pdf
http://www.phys.uconn.edu/~gibson/Notes/Section5_2/Sec5_2.htm
http://www.phys.uconn.edu/~gibson/Notes/Section5_3/Sec5_3.htm
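For reference, the textbook relations behind these two criteria (standard order-of-magnitude definitions, not taken from the links above) can be written as:

```latex
% Temporal coherence: coherence time set by the spectral width \Delta\nu,
% and the corresponding longitudinal coherence length
t_c \approx \frac{1}{\Delta\nu}, \qquad \ell_c = c \, t_c

% Spatial (transverse) coherence radius for light of wavelength \lambda
% arriving from a source of angular size \theta_s
R_c \approx \frac{\lambda}{\theta_s}
```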


ST05 shows FLAG=N"X" in S/4HANA system. Why?

We are comparing performance between a classic R/3 and an S/4 system, and simple SELECTs from standard function modules (e.g. selecting records from an IDoc table) look different in S/4.
The most interesting observations are these:
The S/4 system performs worse than R/3 (with the same number of records stored in the DB table).
Where the R/3 trace (in this case a FOR ALL ENTRIES, a.k.a. FAE) shows WHERE STATUS = 69 or FLAG = "X", in S/4 we see the literal prefixed with an N, like FLAG = N"X".
I assumed this stands for negation, BUT the code clearly says EQUALS.
And because the S/4 performance is so bad compared to R/3, I assumed that S/4 somehow sometimes cannot deal with FAE, and that one of the side effects is to negate the WHERE clause on the fields of the FAE-related source table...
What does this N stand for?
FLAG=N"X"
It does not mean negation: it means the value is sent down to HANA as a hardcoded Unicode (NCHAR) literal. N'...' is the standard SQL syntax for a national character string literal.
S4 performance
This was not directly asked, but I think it is important to answer as well. There can be several reasons why R3 is sometimes faster:
In R3 you have Oracle or DB2 (DB6), the SELECT uses the perfect index, and the data is in the cache
You are comparing SELECTs on a table that was transparent in R3 but is a compatibility view in S4, like MARC or ANLC
Your S4 hardware is slower than the R3 hardware. This is quite common: new CPUs with very high core counts run the individual cores slower than a decade ago, so the total throughput is much higher, but each individual report and transaction runs slower
In my experience these are the typical cases where S4 is slower than R3.

Time Complexity for Spark/Distributed Algorithms

If we have a given time complexity for some sequential algorithm, how can we express the time complexity of the same algorithm implemented in Spark (a distributed version), assuming we have 1 master node and 3 worker nodes in the cluster?
Similarly, how can we express O(n^2) time complexity for a Spark algorithm?
Moreover, how can we express space complexity in HDFS with a replication factor of 3?
Thanks in advance!
Let's ignore orchestration and communication time (which is often not realistic; for example, when sorting the whole dataset, the operation cannot simply be "split" across partitions).
Let's also make another convenient assumption: the data is perfectly partitioned among the 3 workers, so every node holds n/3 of the data.
With that said, I think we can consider an O(n^2) algorithm as the sum of three O((n/3)^2) partial computations running in parallel, hence a final O((n/3)^2). The same goes for any other complexity: O(n^2 log n) becomes O((n/3)^2 log(n/3)).
As for the replication factor in Hadoop: given the assumptions above, since the operations are executed in parallel among replicas (which are distinct from partitions), the complexity is the same as for an execution on a single replica.
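Under the same assumptions (zero coordination cost, perfect n/3 partitioning), the arithmetic can be sketched in plain Python. This is a cost model only, not actual Spark code, and the function names are made up for illustration:

```python
# Model the wall-clock cost of an O(n^2) algorithm when the data is
# evenly split across 3 workers, ignoring all communication cost.

def sequential_ops(n):
    """Operation count for the sequential O(n^2) algorithm."""
    return n * n

def distributed_wall_clock_ops(n, workers=3):
    """With perfect partitioning, each worker does (n/workers)^2 ops
    in parallel, so wall-clock cost is the per-partition cost."""
    per_partition = n // workers
    return per_partition * per_partition

n = 900
seq = sequential_ops(n)               # 900^2 = 810000
par = distributed_wall_clock_ops(n)   # (900/3)^2 = 90000
print(seq, par, seq // par)           # speedup factor of workers^2 = 9
```

Note that asymptotically O((n/3)^2) is still O(n^2); the constant factor (here workers^2) is what the cluster buys you.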

Neo4j scalability and indexing

An argument in favor of graph DBMSs with native storage over relational DBMSs, made by Neo4j (also in the Neo4j graph databases book), is that "index-free adjacency" is the most efficient means of processing data in a graph (due to the 'clustering' of the data/nodes in a graph-based model).
Based on some benchmarking I've performed, where 3 nodes are sequentially connected (A->B<-C) and, given the id of A, I query for C, the scaling is clearly O(n) when testing the same query on databases with 1M, 5M, 10M and 20M nodes. This is reasonable (with my limited understanding), since I am not limiting my query to one node, so all nodes need to be checked for a match. HOWEVER, when I index the queried node property, the execution time for the same query is roughly constant.
The figure shows execution time by database size before and after indexing: the orange plot is an O(n) reference line, the blue plot the observed execution times.
Based on these results I'm trying to figure out where the advantage of index-free adjacency comes in. Is this advantageous when querying with a limit of 1 for deep(er) links? E.g. depth of 4 in A->B->C->D->E, and querying for E given A. Because in this case we know that there is only one match for A (hence no need to brute force through all the other nodes not part of this sub-network).
As this is highly dependent on the query, I'm listing an example of the Cypher query below for reference (where I'm matching entity labeled node with id of 1, and returning the associated node (B in the above example) and the secondary-linked node (C in the above example)):
MATCH (:entity{id:1})-[:LINK]->(result_assoc:assoc)<-[:LINK]-(result_entity:entity) RETURN result_entity, result_assoc
UPDATE / ADDITIONAL INFORMATION
This source states: "The key message of index-free adjacency is, that the complexity to traverse the whole graph is O(n), where n is the number of nodes. In contrast, using any index will have complexity O(n log n)." This statement explains the O(n) results before indexing. I guess the O(1) performance after indexing is comparable to hash table performance(?). I'm not sure why the complexity with any other index would be O(n log n), when even a hash table's worst case is O(n).
From my understanding, the index-free aspect is only pertinent for adjacent nodes (that's why it's called index-free adjacency). What your plots demonstrate is that once you find A, the additional time to find C is negligible; the question of whether to use an index only concerns finding the initial queried node A.
Finding A without an index takes O(n), because every node in the database has to be scanned; with an index it's effectively a hash table lookup and takes O(1) (no clue why the book says O(n log n) either).
Beyond that, finding the adjacent nodes is not hard for Neo4j, because they are linked to A, whereas in the relational model the linkage is not as explicit: a join, which is expensive, is required, followed by a scan/filter. So to truly see the advantage, one should compare the performance of graph and relational DBs while varying the depth of the relations/links. It would also be interesting to see how a query performs when the number of neighbours of the entity nodes increases (i.e. the graph becomes denser): does Neo4j rely on graphs never being too dense? Otherwise the problem of looking through the neighbours to find the right one repeats itself.
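To make the distinction concrete, here is a toy in-memory model in Python (hypothetical, not Neo4j internals): the start node is found via a hash lookup, and the two-hop A->B<-C pattern is then answered purely by following adjacency lists, independent of total graph size:

```python
# "Index-free adjacency" toy model: each node stores direct references
# to its neighbours, so traversal never consults a global index.

# outgoing LINK edges: node id -> ids of directly linked nodes
adjacency = {
    "A": ["B"],
    "C": ["B"],
    "B": [],
}
# incoming LINK edges, so we can answer (A)-[:LINK]->(B)<-[:LINK]-(C)
incoming = {"A": [], "B": ["A", "C"], "C": []}

def two_hop_neighbors(start):
    """Nodes that share an out-neighbour with `start` (the C for a
    given A). Cost is O(deg(start) * deg(mid)), not O(total nodes)."""
    results = set()
    for mid in adjacency[start]:      # hop 1: follow stored pointers
        for other in incoming[mid]:   # hop 2: follow back-pointers
            if other != start:
                results.add(other)
    return results

# finding "A" itself is the dict (hash) lookup -- that is the part an
# index replaces; without one, a full scan of all nodes is needed
print(two_hop_neighbors("A"))  # {'C'}
```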

Slow spatial comparison when using cross join

I'm using U-SQL to select all objects which are inside one or more of the shapes. The code works but is really slow. Is there some way to make it more performant?
#rs1 =
SELECT DISTINCT aisdata.imo,
portshape.unlocode
FROM #lsaisdata AS aisdata
CROSS JOIN
#portsshape AS portshape
WHERE Geometry.STMPolyFromText(new SqlChars(portshape.shape.Replace("Z", "").ToCharArray()), 0).STContains(Geometry.Point(aisdata.lon, aisdata.lat, 0)).IsTrue;
Added more information about my issue:
I've registered Microsoft.SqlServer.Types.dll and SqlServerSpatial130.dll to be able to use spatial functions in U-SQL
I'm running my job in Data Lake Analytics using two AUs. Initially I used 10 AUs, but the Diagnostics tab stated that the job was 8 AUs over-allocated and max useful AUs was 2.
The job takes about 27 minutes to run with the UDT code below and the cross join takes almost all of this time
The input is one csv file (66 Mb) and one wkt file (2.4 Mb)
I'm using Visual Studio 2015 with Azure Data Lake Tools v2.2.5000.0
I tried encapsulating some of the spatial code in UDTs and that improved the performance to 27 minutes:
#rs1 =
SELECT DISTINCT aisdata.imo,
portshape.unlocode
FROM #lsaisdata AS aisdata
CROSS JOIN
#portsshape AS portshape
WHERE portshape.geoShape.GeoObject.STContains(SpatialUSQLApp.CustomFunctions.GetGeoPoint(aisdata.lon, aisdata.lat).GeoObject).IsTrue;
First, a CROSS JOIN will always explode your data into an NxM matrix. Depending on the number of rows, this can make it very expensive and can make it hard to estimate the correct degree of parallelism.
Secondly, I assume the spatial join you are doing is an expensive operation. For example, if you used SQL Server 2012's spatial capabilities (2016 has a native implementation of the type that may be a bit faster), I assume you would see similar performance behavior. Most of the time you need a spatial index to get better performance. U-SQL does not support spatial indices, but you could probably approximate the same behavior with an abstraction (such as tessellating the objects and determining whether their cells overlap) to provide a faster pre-filter/join before testing the exact condition to weed out the false positives.
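As an illustration of the tessellation idea, here is a rough Python sketch (the names, the grid cell size and the bounding-box simplification are all assumptions; in U-SQL the cell id would become an extra equi-join key so the expensive STContains only runs on candidate pairs):

```python
# Grid-cell pre-filter: assign shapes and points to coarse grid cells,
# join on cell id, and only run the exact containment test on pairs
# that share a cell.

CELL = 1.0  # grid cell size in degrees; a tuning assumption

def cell_of(lon, lat):
    return (int(lon // CELL), int(lat // CELL))

def cells_of_bbox(min_lon, min_lat, max_lon, max_lat):
    """All grid cells overlapped by a shape's bounding box."""
    return {
        (cx, cy)
        for cx in range(int(min_lon // CELL), int(max_lon // CELL) + 1)
        for cy in range(int(min_lat // CELL), int(max_lat // CELL) + 1)
    }

# toy data: port shapes reduced to bounding boxes, ships as points
shapes = {"PORT1": (10.2, 53.4, 10.4, 53.6)}
points = [("ship1", 10.3, 53.5), ("ship2", 120.0, 30.0)]

# build the cell -> shapes index once (cheap compared to an N x M join)
index = {}
for name, bbox in shapes.items():
    for c in cells_of_bbox(*bbox):
        index.setdefault(c, []).append(name)

# pre-filter: candidate pairs share a cell; the exact (expensive)
# containment test would run only on these survivors
candidates = [
    (ship, name)
    for ship, lon, lat in points
    for name in index.get(cell_of(lon, lat), [])
]
print(candidates)  # only ('ship1', 'PORT1') survives the pre-filter
```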

Datastax consistency

We've installed DataStax on five nodes, with search enabled on all five and a replication factor of 3. After adding 590 rows to a table, a SELECT from node 1 retrieves 590 rows, but selecting from the other nodes returns anywhere from 570 to 585 rows.
I tried using CONSISTENCY QUORUM in cqlsh, but nothing changed. Also, solr_query is not supported with CONSISTENCY QUORUM.
Is there a way to ensure that all data written to Cassandra is retrieved exactly as written?
As LHWizard mentioned, if you use consistency levels such that (nodes_written + nodes_read) > RF, you will ensure immediate consistency.
In your case, you can try using CONSISTENCY ALL on your reads so that all replicas are checked before returning (this is immediately consistent even with a write CL of ONE). This should also trigger a read repair on the inconsistent nodes, and the missing data will be streamed to them.
You're right that solr queries can only be read at CL ONE. If you have stronger consistency requirements, you will need to raise the CL for your writes to achieve what you need.
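The rule (nodes_written + nodes_read) > RF can be sketched as a quick check (a toy Python model; the replica counts per level assume RF = 3 and are illustrative):

```python
# Immediate consistency holds when the write and read replica sets
# must overlap: replicas_written + replicas_read > RF.

RF = 3
LEVELS = {"ONE": 1, "QUORUM": RF // 2 + 1, "ALL": RF}  # replicas contacted

def immediately_consistent(write_cl, read_cl, rf=RF):
    return LEVELS[write_cl] + LEVELS[read_cl] > rf

print(immediately_consistent("ONE", "ALL"))       # True:  1 + 3 > 3
print(immediately_consistent("ONE", "ONE"))       # False: 1 + 1 <= 3
print(immediately_consistent("QUORUM", "QUORUM")) # True:  2 + 2 > 3
```

This is why CL ALL on reads fixes the symptom even with writes at ONE, and why solr_query (read at ONE) forces the writes to carry the consistency burden.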