Recently I encountered a need for two methods in my code:
"info siblings" and "info root". The latter is similar to "info parent" but travels down the object tree to find the first object or root (of a tree OR a branch).
I realize that it is very trivial to script them in myself but I am dealing with thousands (if not millions in the future) of objects and would like to squeeze as much performance from the code as possible. Hence, I propose to add those methods in C code (to NSF), provided that coding them in C will indeed speed things up. Also, having those methods seems to be natural, just like having "info parent" or "info children".
Another proposal (tongue-in-cheek, since it probably benefits only my code) is to add a timestamp (microseconds) to every object upon creation. I need to list the objects (thousands, if not millions) in the order of creation. Again, it is trivial to script it in, but performance is an issue.
Thanks
On 19.04.15 at 22:21, Victor Mayevski wrote:
Recently I encountered a need for two methods in my code:
"info siblings" "info root" -- similar to "info parent" but travels down the object tree to find the first object or root (of a tree OR a branch).
I realize that it is very trivial to script them in myself but I am dealing with thousands (if not millions in the future) of objects and would like to squeeze as much performance from the code as possible. Hence, I propose to add those methods in C code (to NSF), provided that coding them in C will indeed speed things up. Also, having those methods seems to be natural, just like having "info parent" or "info children".
Is this what you have in mind? The questionable parts are the ones where namespaces are used that do not correspond to objects. I would not expect huge differences in performance when coding this in C.
================================================
package req nx::test

nx::Object public method "info root" {} {
    set parent [:info parent]
    if {![nsf::is object $parent]} {
        return [self]
    }
    return [$parent info root]
}

nx::Object public method "info siblings" {} {
    set parent [:info parent]
    set self [self]
    set siblings {}
    foreach c [info commands ${parent}::*] {
        if {[nsf::is object $c] && $c ne $self} {
            lappend siblings $c
        }
    }
    return $siblings
}

#
# Some test cases
#
namespace eval ::test {
    nx::Object create o
    nx::Object create o::p
    nx::Object create o::p::q
    nx::Object create o::i
    nx::Object create o::j
    nx::Object create o2
    nx::Object create o3

    ? {o info root} ::test::o
    ? {o::p::q info root} ::test::o

    ? {lsort [o info siblings]} {::test::o2 ::test::o3}
    ? {lsort [o::p info siblings]} {::test::o::i ::test::o::j}
}
================================================
Another proposal, (tongue-in-cheek, since it probably benefits only my code), is to add a timestamp (microseconds) to every object upon creation. I have a need to list the object (thousands, if not millions) in the order of creation. Again, it is trivial to script it in, but performance is an issue.
By adding time stamps to the code, the size of every object will increase, which we really want to avoid.
How about the following approach? This is like an implementation of "new" that uses the creation time as the object name. To avoid confusion with digit overflows, one needs, in the general case, a custom sort function or a "format" for the name generation, but that should be pretty straightforward.
#
# Keep order of creation
#
nx::Class create C

for {set i 0} {$i < 100} {incr i} {
    C create [clock clicks -microseconds]
}
puts [lsort [C info instances]]
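The sorting caveat mentioned above (names whose digit count differs) can be sketched in plain Tcl without NX. The names below are made-up stand-ins for what [C info instances] might return, and padded_name is a hypothetical helper, not part of NX or NSF. It is only a sketch of the two options named above: a custom sort mode, or a "format" at name-generation time.

```tcl
# Made-up object names with a differing number of digits (assumption:
# instance names look like "::<microseconds>").
set names {::1000000 ::999999 ::1000001}

# Plain lsort compares strings, so "::999999" sorts after "::1000001".
set plain [lsort $names]

# -dictionary compares embedded digit runs numerically, which restores
# the creation order even though the names carry a "::" prefix.
set ordered [lsort -dictionary $names]

# Alternative: zero-pad when the name is generated, so that a plain
# lsort is sufficient. Hypothetical helper; %lld avoids truncating
# 64-bit microsecond values.
proc padded_name {microseconds} {
    return [format %020lld $microseconds]
}

puts $plain
puts $ordered
puts [padded_name [clock microseconds]]
```

Note that two objects created within the same microsecond would still get the same name, so a real name generator would probably need a counter or a retry.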
If you use huge trees, I am not sure that using nested Tcl namespaces is the best approach memory-wise, but I do not know the requirements of your application or the structure (width, depth, size) of the object trees.
Maybe it is better to use ordered composites to avoid a high number of namespaces and huge object names, e.g.:
https://next-scripting.org/xowiki/file/docs/nx/examples/container.html?html-...
http://openacs.org/api-doc/procs-file-view?path=packages%2fxotcl-core%2ftcl%...
nx/xotcl objects require much less memory when they are namespace-less (contain no children or per-object methods).
All the best -g
On Sun, Apr 19, 2015 at 2:25 PM, Gustaf Neumann neumann@wu.ac.at wrote:
nx/xotcl objects require much less memory when these are namespace-less (contain no children or per-object methods).
I am trying to create a generic mechanism for ordered nested objects. Although it is true that objects with namespaces use more memory, I accept that as a trade-off for the flexibility I get. I really strive for a "natural" order of objects (after all, that's what objects are supposed to do: be an analog for the real world). I am also trying to make it as unobtrusive as possible (without a tracking mechanism etc.). The reason I wanted timestamps is so that I could still have naturally named objects when needed.
Per your suggestions, I am thinking of using a hybrid name+timestamp ("-prefix name" option) instead of a bare timestamp for each object. Also, [lsort [Class info instances]] seems to be the fastest way to list objects in order, instead of having to track each object. In addition to that, I can also implement a caching mechanism to speed things up even further.
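A minimal plain-Tcl sketch of such a hybrid name, under stated assumptions: the layout "<prefix>-<20-digit microseconds>" and the helpers stamped_name and by_stamp are hypothetical and not part of NX. Because the prefix comes first, a plain lsort would group by prefix, so the sketch sorts on the trailing timestamp instead.

```tcl
# Hypothetical helper: build "<prefix>-<zero-padded microseconds>".
# The optional second argument only exists to make the example
# deterministic; by default the current time is used.
proc stamped_name {prefix {now ""}} {
    if {$now eq ""} {
        set now [clock microseconds]
    }
    return [format %s-%020lld $prefix $now]
}

# Compare two names by their last 20 characters (the padded timestamp).
proc by_stamp {a b} {
    return [string compare [string range $a end-19 end] \
                           [string range $b end-19 end]]
}

# Fixed, made-up timestamps 30, 10 and 20 for illustration.
set names [list [stamped_name door 30] [stamped_name wall 10] [stamped_name roof 20]]
set in_creation_order [lsort -command by_stamp $names]
puts $in_creation_order
```

With this layout, names collide only when the same prefix is used twice within one microsecond. Putting the timestamp before the prefix would allow a plain lsort, at the cost of less readable names.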
Thanks.
On 23.04.15 at 19:08, Victor Mayevski wrote:
nx/xotcl objects require much less memory when these are namespace-less (contain no children or per-object methods).
I am trying to create a generic mechanism for ordered nested objects. Although it is true that objects with namespaces use more memory, I accept it as a trade off for flexibility I get. I really strive for a "natural" order of objects (after all that's what objects are supposed to do: be an analog for the real world). I am also trying to make it as unobtrusive as possible (without a tracking mechanism etc). The reason I wanted timestamps is that I could have naturally named objects, when needed.
I did some tests creating large trees (1 million nodes).
The performance and memory consumption depend on the degree of the tree. If one creates, e.g., a tree of degree 16, then namespaced objects are actually the best choice, since only a small fraction of the nodes (the non-leaf nodes) have namespaces. The ordered composite requires some double bookkeeping: no matter how the objects are created, they are children of a (maybe global) namespace, and additionally one has to maintain a list of the children of each node. It is true that the inner nodes of the namespaced variant require more memory, but one saves the extra bookkeeping.
This advantage vanishes when the degree of the tree is low. For a binary tree, half of the nodes have namespaces ("#ns" below). Time becomes worse, since in deep trees there are many inner nodes, which are checked for existence.
If you have a huge tree, I would certainly recommend running experiments, since the differences can be huge.
-g
Ordered Composite
degree  #ns      RSS   micro secs per obj
2       0        1130  34.0
10      0        1010  14.0
16      0        1010  14.6
Namespaced objects
degree  #ns      RSS   micro secs per obj
2       500,000  1020  57.7
10      100,000  725   17.7
16      62,500   716   15.5
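The "#ns" column for namespaced objects matches the number of nodes divided by the degree: in a full tree every inner node has "degree" children, so roughly one node in "degree" is an inner node and needs a namespace. A small plain-Tcl check of that arithmetic, assuming the 1 million nodes mentioned above (the variable names are illustrative only):

```tcl
# Assumption: 1,000,000 nodes, as stated for the test runs above.
set nodes 1000000
set ns_counts {}

foreach degree {2 10 16} {
    # Approximate number of inner (namespaced) nodes: nodes / degree.
    set inner [expr {$nodes / $degree}]
    lappend ns_counts $inner
    puts [format "degree %2d: about %d namespaces" $degree $inner]
}
```

This is only the counting argument behind the table. It says nothing about the measured RSS or timing figures, which have to be obtained by experiment.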