Author: Josip Zohil, Koper, Slovenia, April 2013
Visual FoxPro (VFP) has a built-in synchronization mechanism which refreshes data on distributed nodes/clients/servers. In this article the author presents VFP processes added to a VFP application to create a local server with replicated data. This server is at the same time a client and a server. You can use this local server to improve VFP performance and to add messaging and peer-to-peer capabilities to your application.
»Replication is a set of technologies for copying and distributing data and database objects from one database to another and then synchronizing between databases to maintain consistency« (SQL Server Replication). Replication servers are usually deployed to improve performance by adding parallel processing power.
When you run a VFP application on a Local Area Network (LAN) as a multiuser (client/server) system, you can look at it as a distributed system with the following characteristics:
- Computers are connected;
- The computational workload of the system is distributed between the server and the clients (nodes). Data is manipulated on the nodes (distributed computation). When you run a VFP application with smaller tables, the performance is acceptable, but the network transport and refresh of larger tables will suffer;
- VFP synchronization service can refresh multiple data copies (buffered data) on distributed nodes (clients/workstations).
Example. Run this program on two clients/workstations of VFP application:
Set refresh to 1,-1
Use employee
Browse
Change a field’s value in the Browse window on the first computer and then skip to another record. On the second workstation you see the changed value. Behind the scenes VFP synchronizes the local data on both workstations.
- There is no inter-process communication between the two client processes (except the synchronization (refresh), possible table locking and some other functionality). Also, on multi-processor/multi-core computers each client process runs as a single process (no public or similar variables pass values between processes). The VFP applications (processes) share disks and database data, but no other memory.
- There is only one server (file server).
Note. A network node is a server, a workstation/client, a web service or some other connected service, such as a socket connection or a COM object.
On a LAN you run multiuser VFP applications using the client/server model. You can accept connections from other systems and connect a VFP client to other systems. The model is the same: single-process execution, without parallel exploitation of free network resources. Also, in case your application connects at the same time to two different VFP databases, you communicate with them serially.
There are various ways of parallelizing resources in order to improve system performance. In this article I shall extend the VFP client-server system by adding replica capabilities at the client side. A primary motivation for replicating data in a distributed-computation solution is to cut the network latency of data access by keeping data close to the application that uses it, thereby improving the response time of this application. On a workstation I am going to create a local database. It is a client that pulls the data from the VFP server (master). At the same time it is a server: it pulls the data from the master server and serves this data to other clients (Napster style: a node (peer, client) offers its files to other peers).
Let us consider a problem from the production environment. We have a large data table »dnevnal« (150 MB, 500,000 records) with customer documents. It is not normalized. Each record holds information about the document (number-syd, date), the customer (code, name-naziv and address) and the products sold (code, quantity, price). It also has the field »syd« (the document’s number) and the field id, an auto-increment number. With the query
»select distinct syd, naziv from dnevnal into cursor xx order by syd« (A)
we get a list of all document numbers and customer names. We bind the query results to the combo box in which the user selects a document. When we run this query on a client computer, it executes in two minutes or more.
VFP has to transport a large amount of data over the network from the server to the client. The table “dnevnal” is large (500,000 records). When we run the query (A) on the server computer it executes in one second, inside a virtual machine on the server computer in 30 seconds, and on a wireless-connected workstation in 180 seconds. Can we approach the server’s query speed on the workstation? Can we do this using free computer resources and programming techniques?
Our queries execute fast on the server computer, but the slow network suffers, so the network data transport is the bottleneck: the query has to transport 150 MB in order to aggregate them into a small cursor (0.5 MB). The network (transport) is our most loaded resource; it is the bottleneck, so we shall distribute some of its load to the local computer.
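A back-of-envelope check of this claim, sketched in Python rather than VFP. The 150 MB table size and the 0.5 MB result size come from the measurements above; the throughput figures are assumptions for illustration only:

```python
# Estimate: time to move the query's input over the wire versus reading it
# from a local disk. The throughput numbers below are assumed, not measured.

def transfer_seconds(size_mb, throughput_mb_s):
    """Seconds needed to move size_mb at the given throughput."""
    return size_mb / throughput_mb_s

table_mb = 150           # size of dnevnal (from the article)
wireless_mb_s = 1.0      # assumed slow wireless LAN throughput
local_disk_mb_s = 150.0  # assumed local disk read throughput

remote = transfer_seconds(table_mb, wireless_mb_s)     # query over the wire
local = transfer_seconds(table_mb, local_disk_mb_s)    # query on a local replica
print(remote)  # 150.0 -> the same order as the observed 180 s
print(local)   # 1.0   -> the same order as the observed server-side 1 s
```

The ratio between the two numbers, not their absolute values, is the point: the transport dominates, so moving the data close to the query pays off.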
You can improve the query speed in various ways:
1) add better hardware;
2) normalize the table “dnevnal” and navigate with smaller normalized data tables. You can look at table normalization as a kind of distribution, with a very limited number of distribution parts (mostly two or three). It is difficult to partition the normalized tables; they have limited possibilities to scale. Using the normalization technique you can reduce the problem, but you do not solve it: the normalized tables can also grow to 150 MB and more;
3) use the VFP synchronization (refresh) capabilities, and thus add to VFP more distribution and parallel processes. You can distribute the data on the network, execute the queries in parallel and synchronize the distributed parts.
You can add a parallel VFP server to the workstation: a local database and a local service (process/agent) that:
- Watches the server (master) database events (insert and update triggers);
- Maintains the local data (tables and indexes) and synchronizes it with the server's. It refreshes the local data periodically and/or on demand. The downloaded refreshed data are chunks of the table “dnevnal”;
- The clients then read the data from this local server (database).
The process creates a local database and synchronizes (refreshes) its data with the master server’s database. We shall use two refresh strategies:
1) Updating local data on master server’s changes (asynchronous process);
2) Updating them on demand, when the client needs them (synchronous process).
Refreshing local data means fetching chunks of the master's data and updating/inserting them into the local (replica) database. We suppose that most data changes are at the end of dnevnal (the last records). We have to fetch only a set of records at the table's end. There are also the inserted records. The table has an auto-increment (unique) field. For example, if we make changes in the record with the auto-increment (AI) number 200, we expect that there are probably changes in the records with AI greater than 200, so we fetch data from records with AI greater than or equal to 200. There are no deleted records in the table.
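The refresh strategy above can be sketched as follows, in Python rather than VFP; the in-memory dictionaries stand in for the master and local tables, keyed by the auto-increment id:

```python
# Minimal sketch of watermark-based refresh: re-fetch everything from the
# smallest changed id (lastid) onward, overwriting stale local rows and
# appending new ones. Table contents here are illustrative.

def refresh(master, local, lastid):
    """Pull rows with id >= lastid from master into the local replica."""
    for row_id, row in master.items():
        if row_id >= lastid:
            local[row_id] = dict(row)   # update existing or insert new row
    return local

master = {1: {"syd": 10}, 2: {"syd": 20}, 3: {"syd": 30}}
local = {1: {"syd": 10}, 2: {"syd": 99}}   # row 2 is stale, row 3 is missing
refresh(master, local, lastid=2)
print(local[2]["syd"], 3 in local)  # 20 True
```

Because the table has no deletes, overwrite-or-append is all the protocol needs; a delete would require extra bookkeeping.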
The resource table is the infrastructure of the messages between the master server, the local servers and the client applications. There are three users of this table: the master server, the local services and a VFP application. In the application’s database add a table “resource” with these fields: mval L, cluster L, hb L, stop L, node C(80), filePath C(80), lastid I. You will store system resources data (their states) in it.
Each node (workstation, virtual machine) has a record in this table. Its fields (columns) are:
- node: the workstation’s name (SYS(0)), used as its key;
- mval: a client’s request to refresh the local data;
- cluster: set when the master table dnevnal has changed;
- hb: the local service’s heartbeat;
- lastid: the smallest auto-increment number of the set of changed records;
- filePath: the path to the local database;
- stop: a request to stop the local service.
We add an insert and update trigger code as two stored procedures to the database table dnevnal:
* Insert Trigger:
FUNCTION inserttrig () && dnevnal.id is an auto-increment number
UPDATE resource SET ;
cluster = .t. , hb=.f., lastid= IIF(lastid>dnevnal.id, dnevnal.id, lastid)
RETURN .T.
* Update Trigger:
FUNCTION updatetrig ()
UPDATE resource SET cluster = .t. , hb=.f.,;
lastid= IIF(lastid>dnevnal.id, dnevnal.id, lastid)
RETURN .T.
On each SQL insert/update of the table dnevnal, a trigger event fires (the inserttrig/updatetrig function) and updates the table »resource« (the column cluster is set to true and hb to false). VFP propagates the values in »cluster« and »hb« to all nodes. The workstation’s local service catches the message. I shall describe this mechanism in the next section.
VFP has a built-in synchronization (refresh) service so it is simple to add new distribution capabilities: a workstation (local) replica server.
Let us describe this program:
*cluster.prg
*config parameters
lc1="c:\temp\local" && path to the local dbc
?lc1
lctable="d:\moj\xxx\data\dnevnal" && or network path
lcname="dnevnal" && table name
idmax=999999999 && used to control changes on the master server (a kind of global variable)
Set Procedure To mesproc
Set Procedure To clusterproc Additive && dnevnal
Init() && initialize app
openres(lc1) && open table resource (message table)
?lctable
createloc(lctable,lc1) &&create local dbc, otherwise open it
Select Local && local table
Go Bott
m.lastLocId=Id && increment number (state parameter)
Do While .T. && loop forever, simulate a tail recursive function
*SELECT resource
checknew() &&refresh local data (catch a new message)
Select Resource
* read a message
DO case
CASE resource.stop && received a message stop
CLOSE DATABASES all
quit
case Resource.mval Or Resource.cluster && in either case refresh local data
*start refresh (update and insert) process
Select Resource
m.lastid=Resource.lastid
initupdate(idmax) && from clusterproc
t1=Seconds()
IF used(lcname)
Select (lcname) && SELECT dnevnal , index on id
ELSE
SELECT 0
USE (lctable)
endif
Go Bott
m.incnostart=Id
?"Last inc.no. in dnevnal",m.incnostart,"Last local inc.no.:",m.lastid
t1=Seconds()
*download the "old" records to update the local data
* Improve the protocol: if no changes, throw away the client's update
updateloc(lctable,m.lastid,m.incnostart) && local and remote inc.no.
? "Elapsed time update:",Seconds()-t1
t1=Seconds()
*pull the "new" records to append to local data
insertloc(m.lastLocId) && from clusterproc
? "Elapsed time insert:",Seconds()-t1
sendMesDone() && update state and send (publish) a message local update is done
* Select Local
&& end of refresh (update, insert) process
endcase
Wait Window "Press a key or S to stop." To lres Timeout 0.2
If Upper(lres)=="S"
*Return
sendMesStop() && send a message to stop
Endif
* update state (heartbeat) (publish a message: I am alive),
sendMesHb() && sent after sendMesDone
Enddo &&loop
Return
This program creates a local replica database (if it does not exist); otherwise it refreshes its data with the master's. It inserts into the resource table a new record with the node name/key (sys(0)). This program is also an event manager built on top of the resource table. It has to be executed in the background (not visible) (or visible in a test environment), with the refresh parameter set to 1. In the resource table it finds the workstation's record (field node) and reads the node's state from it. If the record's value "mval" or "cluster" is true, it downloads a chunk of data from the server and updates the local server. On a server (master) event a value (cluster=.T.) is written into the resource table; the local service listens for this value (message) and then pulls the data from the master.
The cluster process pulls the data in two steps. It starts, fetches from the master server the updated records and after that the inserted ones. The update is executed on the rows that already exist in the local table (updateloc). The insert is executed on the rows that don't exist in the local table (insertloc). At the end of the program it writes a new state (mval=.F., cluster=.F.) in the node's resource record, waits briefly in the background and then loops forever.
During the time window from the start to the end of the update process, there can be changes on the master server. In such a case lastid is less than idmax (resource.lastid<idmax) and we leave lastid unchanged (the process will refresh this data in the next cycle).
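The role of the idmax sentinel can be sketched like this, in Python rather than VFP, following the prose description above; the class and method names are illustrative:

```python
IDMAX = 999_999_999  # sentinel meaning "no changes seen during this refresh"

class ResourceRecord:
    def __init__(self):
        self.lastid = IDMAX          # initupdate: reset the watermark

    def on_master_change(self, changed_id):
        # trigger: remember the smallest id changed since the refresh began
        self.lastid = min(self.lastid, changed_id)

    def end_refresh(self, incnostart):
        # If lastid < IDMAX the master changed during the refresh window:
        # keep lastid so the next cycle re-fetches from there. Otherwise
        # advance the watermark to the last id we actually copied.
        if self.lastid == IDMAX:
            self.lastid = incnostart

r = ResourceRecord()
r.on_master_change(200)       # the master changed row 200 mid-refresh
r.end_refresh(incnostart=500)
print(r.lastid)  # 200 -> the next cycle refreshes from id 200 again
```

If no trigger fires during the window, lastid stays at IDMAX and end_refresh advances it to the last copied id, so nothing is re-fetched unnecessarily.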
The procedures of the program cluster are documented:
1) *clusterproc.prg
Function createloc(lctable,lc1) && path to the local dbc
t1=Seconds()
idmax=999999999 && used to control changes on the master server (a kind of global variable)
?lctable
Select 0
Use dnevnal && cluster.exe is in the application root directory
Set Order To incno && field id - autoincrement
If Not File(lc1+".dbc") && at first start create the database
Create Database (lc1)
?lctable
Select * From (lctable) Into Table (lc1) Order By Id &&remote table
Brow
Use
Open Database (lc1)
Add Table (lc1)
Use (lc1) Excl
Index On Id Tag Id && index on the auto-increment field
Index On syd Tag syd && index on document number
Use
Close Databases
Endif
Select 0
Use (lc1) Alias Local
Endfunc
Function initupdate(idmax)
Select Resource
Locate For Trim(Resource.Node)==Trim(Sys(0))
Replace lastid With idmax,stop With .F. && in case of server changes this value will be changed (for local use)
ENDFUNC
Function updateloc(lctable,lclastid,lclastlocid)
lcc=Sys(2015)
*download the "old" records to update the local data
* Improve the protocol: if no changes, throw away the client's update
Select * From (lctable) nofilter Into Cursor (lcc) Where Id>=lclastid And Id<=lclastlocid
Go Bott
?"Number of downloaded records:", Recno()
Go Top
*update the local file with new records
Do While Not Eof()
Select (lcc)
Scatter Name m.orig
m.id=Id
Select Local
Seek m.id Order Tag Id
If Eof()
Append Blank
Endif
Gather Name m.orig
Select (lcc)
Skip
Enddo
Select (lcc)
Use
Endfunc && updateloc
Function insertloc(lclastlocid)
*pull the "new" records to append to local data
lcc=Sys(2015)
Select * From dnevnal nofilter Where dnevnal.Id>=lclastlocid Into Cursor (lcc)
Select Local
Append From Dbf(lcc)
Go Bott
?"Number of total records:", Recno()
Select (lcc)
Use
Endfunc && insertloc
2) *mesproc.prg
function init
application.Visible =.t.
SET REFRESH TO 1,0.1 && you can change these parameters
*?SYS(3054,11)
SET SAFETY OFF
SET STATUS off
SET TALK off
SET MULTILOCKS ON
SET ENGINEBEHAVIOR 70
CLOSE DATABASES all
SET EXCLUSIVE off
ACTIVATE screen
***lsecno=100 && max number of possible (not catched) batch insert
ENDFUNC
function openres(lc1) && path to the local dbc
SELECT 0
USE resource
LOCATE FOR TRIM(node)==TRIM(SYS(0)) && workstations' (nodes') name
IF NOT FOUND()
INSERT INTO resource (node,mval,filepath,stop) VALUES (SYS(0),.f.,lc1,.f.) &&publish a message
ELSE
* mval=.t. - refresh local data, publish a message
UPDATE resource SET resource.mval = .t. , filepath = lc1, stop=.f. WHERE TRIM(resource.node)==TRIM(SYS(0))
ENDIF
ENDFUNC &&openres
FUNCTION checkNew()
SELECT resource
LOCATE FOR TRIM(resource.node)==TRIM(SYS(0))
RLOCK() && refresh the records' data
UNLOCK
ENDFUNC
FUNCTION sendMesDone()
SELECT resource && publish a message the update is done
*If server data was updated in this period: resource.lastid<idmax
UPDATE resource SET resource.mval = .f. , resource.lastid = IIF(resource.lastid<idmax, resource.lastid, m.incnostart),;
resource.cluster = .f. WHERE TRIM(resource.node)==TRIM(SYS(0))
ENDFUNC &&sendMesDone
FUNCTION sendMesHb() && update the heartbeat (publish a message: I am alive)
UPDATE resource SET resource.hb = .t. WHERE TRIM(resource.node)==TRIM(SYS(0))
endfunc
FUNCTION sendMesStop() && stop the service
UPDATE resource SET resource.stop = .t. WHERE TRIM(resource.node)==TRIM(SYS(0))
endfunc
The presented program is a simple VFP procedure; the resource file and its management are what is specific. When we, or a VFP statement, change a record, every node notices the change and its local process automatically refreshes the local data. After reading a record in the table “resource”, a VFP application can use these values:
- The local service “heartbeat” (field hb) tells us whether the service is working or down;
- The field “cluster” tells us whether the table dnevnal has been updated;
- In “lastid” there is the smallest auto-increment number of the set of records to be downloaded;
- When VFP application sets mval to true, it sends a message to the local service and after receiving it (reading it) in less than a second, the service refreshes the local data (pulls from the master server the possible changes).
- The value in the field node is the workstation “key”.
- There is also a path to the local data (filePath).
A node application (server/client/service) writes a node state and/or a message into the table resource. VFP propagates this data to all nodes. On every node you can see the state of all running services. By locking a node’s record you can read and write (manipulate) its state. For example, if on Node1 we look at the record with the name “Node2” and its hb field is true, the service on Node2 is running; we also see the last increment number in its local table.
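Reading a peer's state can be sketched like this, in Python rather than VFP; the record contents and the peer_state helper are illustrative, with field names taken from the resource table:

```python
# The shared resource table as a list of per-node records; any peer can
# inspect any other peer's state by reading its record.

resource = [
    {"node": "Node1", "hb": True,  "cluster": False, "lastid": 500},
    {"node": "Node2", "hb": False, "cluster": True,  "lastid": 420},
]

def peer_state(table, node):
    """Classify a peer from its resource record."""
    rec = next(r for r in table if r["node"] == node)
    if not rec["hb"]:
        return "service down"            # no heartbeat
    if rec["cluster"]:
        return "refresh pending"         # master changed, not yet pulled
    return "replica current"

print(peer_state(resource, "Node1"))  # replica current
print(peer_state(resource, "Node2"))  # service down
```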
Note. You have to manipulate data using SQL commands (they lock, fire events and propagate data).
On the computer »server«, in the root application folder, create a project »cluster«; add to it the program »cluster.prg«, the database table »dnevnal« and the “resource” table. After building the project, you get “cluster.exe”. Run this VFP application from the computer »Cluster1«. At first start, it downloads the data in 180 seconds on my wireless-connected client (30 seconds on a virtual machine client). It starts a replica server, refreshes the local data (c:\temp\local.dbc) in a loop, stops executing and waits briefly. It also manipulates messages. It runs in parallel with other workstation applications.
The computer »Cluster1« becomes a new local client/server with a local data copy (replica) (c:\temp\local.dbf) of the master server’s »dnevnal«. It connects to both databases, reads from the source, and writes to the target.
Let us illustrate how to use the replica server within your application and how to solve the problem posted in this article.
Suppose you have a VFP application Test with a large database table Dnevnal on the master server (computer). Add a VFP form to this application; add to it a grid with a name Grid1 and a combo box with a name Combo1. In the form’s Data Environment add a table dnevnal and in its OpenTable method add this program:
* OpenTable
SET REFRESH TO 1,1
SET EXCLUSIVE OFF
SYS(3054,11)
USE resource
LOCATE FOR TRIM(node)==TRIM(SYS(0))
IF NOT FOUND()
MESSAGEBOX("Error, Local service not found!")
return
ENDIF
RLOCK() && catch the message
unlock
IF NOT hb
MESSAGEBOX("Error, Local service not running!")
return
ENDIF
RLOCK()
unlock
IF not resource.mval
* publish a message to update the local data
UPDATE resource SET resource.mval = .t. WHERE TRIM(resource.node)==TRIM(SYS(0))
* runs in parallel with the queries in cluster.exe
Endif
This program checks whether the local service is running (hb=.T., node service is present). It sets mval to true and in this way it pushes (sends) a message to the local service to refresh the local data. The local service executes in parallel with the form’s init event. After sending the message, the form initialization continues.
In the form’s init method add this program:
*Init
SET EXCLUSIVE off
*Cluster.exe is running in parallel and is downloading the eventual changes from the server
t1=SECONDS()
lcerr=.f.
ACTIVATE screen
sele resource
LOCATE FOR TRIM(node)==TRIM(SYS(0)) && find the nodes' record
IF NOT FOUND()
MESSAGEBOX("Error, Node service not found!")
lcerr=.t.
ENDIF
RLOCK() && catch the message
unlock
IF NOT hb && heartbeat is false, service is not running
MESSAGEBOX("Error, Node service not running!")
lcerr=.t.
ENDIF
lcloc=resource.filepath
IF NOT lcerr
ii=0
DO WHILE resource.mval && wait until cluster.exe refreshes the local data
RLOCK() &&refresh - catch the message
UNLOCK
SELECT dnevnal
WAIT TIMEOUT 0.1
*UPDATE dnevnal SET koli=koli+1 WHERE id= 812325
ii=ii+1
IF ii=100 && stop waiting, there is an error, this is a timeout
lcerr=.t.
exit
endif
ENDDO
?"Till now this program is running in parallel with Cluster.exe",SECONDS()-t1
endif
lccus=SYS(2015)
* The next query is not optimized. You can improve speed by saving aggregate data in local storage.
IF lcerr && if problems with local service, download data from server
lcloc="dnevnal" &&load balancing switch : if local server is in trouble, pull data from the master
ENDIF
?"End opening tables.",SECONDS()-t1
t1=SECONDS()
SELECT distinct syd,naziv FROM (lcloc) INTO CURSOR (lccus) ORDER BY syd
*the cursor lccus dataset is a normalized version of the original data
?"End query local data.",SECONDS()-t1
thisform.combo1.RowSource = (lccus)
thisform.combo1.Refresh ()
This program continues the process started in the form’s OpenTables event. It tests whether the service is running; after that it pauses until resource.mval becomes false, i.e. until the parallel program cluster stops refreshing the local data. When the parallel service finishes, the program continues its execution. If everything is OK, the program runs the query on the replica (local) database; otherwise it switches the query (slow execution) to the master’s database. This program is smart: if the local server is down, it switches to the master server. The program pulls the data from one of two servers (the master or the replica-local).
Note. Here you can insert other server load-balancing methods.
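The load-balancing switch above can be sketched like this, in Python rather than VFP; choose_source is an illustrative name, and the two source strings stand in for the replica path and the master table:

```python
# Pick the data source for the form's query: the fast local replica when its
# service is healthy and its refresh has finished, otherwise the master.

def choose_source(heartbeat_ok, refresh_done, local_path, master_table):
    """Return the source the form should query."""
    if heartbeat_ok and refresh_done:
        return local_path    # fast: local replica
    return master_table      # slow but always correct: master server

print(choose_source(True, True, "c:/temp/local", "dnevnal"))   # c:/temp/local
print(choose_source(False, True, "c:/temp/local", "dnevnal"))  # dnevnal
```

Other policies (e.g. preferring the replica only below some staleness bound) would slot in at the same decision point.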
The Combo1 InteractiveChange event has this program:
*InteractiveChange
lcg=SYS(2015)
ACTIVATE screen
SYS(3054,11)
t1=SECONDS()
lcd=VAL(this.Value) && documents' number
SELECT * FROM dnevnal INTO CURSOR (lcg) WHERE syd==lcd && download a document
?"Elapsed time to download a document:",SECONDS()-t1
Select (thisform.grid1.RecordSource)
use
thisform.grid1.Visible = .T.
thisform.grid1.RecordSource=(lcg)
Compile the project Test and run it from the workstation computer »Cluster1«. On my wireless-connected client it executes in 4 seconds, about 45 times faster (180/4). It runs in parallel with the replica server service Cluster.exe.
If you use this technique, your queries could run faster; you will have to adapt the programs from this article to your own data.
In a normal VFP application you execute code sequentially, so there is limited need for events. When you execute processes in parallel, events become a precious tool. Parallel processes are workers with limited knowledge. They work together, so they need coordination (management). Events and messages can help.
Let us describe the resource table and its data as a component of an event system. We have three event sources (the master, the client and the local service) and two listeners (the client and the local service). When an update (insert) fires on dnevnal, the update/insert trigger raises an event: it puts the value .T. in the field “cluster” in all records of the table resource (it sends a request to all nodes). The listeners of this event are the local services (cluster.exe). They read this value (catch the master’s message), refresh the local server’s data and raise a new event (end of local data update). We publish this event by updating the resource table record: cluster=.F., mval=.F., hb=.T. and the last local AI (increment) number. Publishing (raising) this event means writing the node’s new (local) state into the resource record.
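The event cycle just described can be sketched as a tiny state machine, in Python rather than VFP; the function names are illustrative, the field names follow the resource table:

```python
# One resource record as a dict; the master publishes cluster=.T., the local
# service consumes it, refreshes, and publishes the "done" state back.

state = {"cluster": False, "mval": False, "hb": False, "lastid": 0}

def master_update_event(s):
    """Trigger on dnevnal fires: request a refresh from all replicas."""
    s["cluster"] = True

def local_service_cycle(s, new_lastid):
    """One loop iteration of the local service (the listener)."""
    if s["cluster"] or s["mval"]:
        # ... pull the changed chunks from the master here ...
        s.update(cluster=False, mval=False, hb=True, lastid=new_lastid)

master_update_event(state)
local_service_cycle(state, new_lastid=500)
print(state["cluster"], state["hb"], state["lastid"])  # False True 500
```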
The Test application is an event publisher: it writes mval=.T. in the record; the local service (Cluster.exe), a listener (subscriber), catches this value, refreshes the local data and publishes a new state. The Test application is also a listener: it watches the field mval (in the form’s init event). When mval becomes false (end of local data update), the form’s init event continues its execution.
Using the locking system and events the VFP programs coordinate three types of parallel processes: the master, the client and the replica server.
The resource table is a container for public variables (values). All its users can manipulate this data.
»A peer-to-peer (abbreviated to P2P) computer network is one in which each computer in the network can act as a client or server for the other computers in the network, allowing shared access to various resources« (Wikipedia). The table resource and the node service (Cluster) are components of a peer-to-peer system with a central service. Each peer (node) has server and client functionality.
Because the message path always goes through a central node (the master server), we call this a peer-to-peer system with a central service.
Peers (nodes, local services) all have similar capabilities; they are autonomous and can provide services to, or interact with, each other. The workstations (and the master server) on which you run Cluster.exe have client and server capabilities and provide ways to control the divergence of replica contents. Your application (a client) can use these clients/servers.
When you run the »cluster.exe« on the workstation’s computer, you extend the network capabilities using the VFP distribution style:
· The VFP application has an important computation role in the network; it takes some load from the server and does some work on the workstation (such as running a query);
· You add it more parallel capability by adding a replica server; this takes load from the server and the network. The local service runs asynchronously, in parallel with your application.
You start »cluster.exe« in a fire and forget way (asynchronously). It adds to the workstation a replica server: local service, database and a table (c:\temp\local.dbf). When you run your basic application (in our case »Test.exe«), you run it in parallel with the local replica service (cluster.exe). The two data sessions run in parallel.
Attention, the two applications interact! From the parallel (concurrent) point of view this is an unsafe cooperation; the second application could change (corrupt) the first application’s data or vice versa.
With the application “cluster.exe” we increase the number of applications in our network. In the overall system we increase the chance of errors. Let us analyze some error possibilities:
1) The first application (cluster.exe) does not start or crashes. The second application (Test.exe) will still execute correctly, but much more slowly. You can remove this inconvenience by restarting the first. When the Cluster application restarts, it reads the minimal id (increment number) of the last updated record (to identify the potentially missing updates) and refreshes the local database.
Note. A process’s restart capability is part of its added value (it is state-independent).
2) The first application is simple and has no statements that can affect the second application data. In a production environment the first application is not visible, so there is no user intervention in it. The second application (Test) statements can affect the first application data only using the table resource and dnevnal.
Each of the two applications has its own data session: they share the tables “dnevnal”, “local” and “resource”. The write/read/refresh operations in these parallel data sessions obey the VFP locking and refresh rules. The programmer’s responsibility is to update only the master data.
Let us analyze two totally different processes, for example:
1) Inside the Test application run the form two times. You run them sequentially: they access the value mval sequentially (one after the other).
2) Run the application Test two times on the same workstation and in each of them run a form. The two forms run in parallel and share the resource mval. If you change this value in one of the forms, you can affect the other form’s execution, and this can be a problem.
If you watch the two applications in Windows Task Manager, each one’s memory usage is bounded above by 250 MB. When it stops loading data, its RAM usage drops to 17 MB. The local VFP service can become critical in memory consumption only if you run in parallel multiple VFP applications with large data tables. The parallel process Cluster.exe consumes a small part of the processor and RAM resources.
You can put more tables in a local database, theoretically all the database tables. The cost of adding smaller tables to the replica server is high, as the VFP engine has to refresh multiple tables on multiple replica servers (nodes). VFP transports smaller tables at a reasonable speed; it is better to hold them on the server. Normally a VFP application has only a few large tables that are worth replicating locally and querying in parallel. Do this judiciously, as the overhead of putting many tables in the replica can cut the benefits.
You can refresh many local tables in the same exe (for example, in »cluster«). In my opinion a better way is to create a separate service (similar to »cluster«) per local table and execute them in parallel. In case a service crashes, you have to restart only that one. The processes are expensive, but the most expensive thing is their serial execution!
In the experiments of this article we have used the local replica as a read only storage. Our problem has been to improve the data download. The local database can receive changes from the server, but must not accept local changes.
Theoretically a client can make changes to local data. In this case you have to solve a relatively difficult problem: how to synchronize local data changes with the server (bidirectionally). For example, your local server's data can be stale. When you update the master's (not stale) data, you have to resolve the conflict between the server's fresh data and the client's stale data.
The local replica is a backup of the table dnevnal. In our example it is at most 1 second stale. With our original query (A) we have 180-second-stale data; in the solution with a local service only 4 seconds (1 second to pull data from the master server and 3 seconds to query the local data). We reduce the chance of unseen writes in our combo box by a factor of 45 (180/4).
The code in this article is not optimized. My goal has been to prove that you can run VFP applications faster by using simple (not optimized) parallel code.
In our example we replicate all the data from the table dnevnal. What we really need locally is a »normalized« subset of the master's data described by this query
»select distinct syd,naziv,id from dnevnal order by syd« (B)
(pull the document number and customer name). This is a smaller data set. You can download, refresh and query this data faster. It is enough if your replica has this data set.
The query (B) is similar to a read only VFP local view.
In our example we have watched:
- Two server resources: table change events (field cluster) and the smallest autoincrement number of the set of modified records;
- Four local resources: whether the local data has been refreshed; the highest auto-increment number in the local table; the path to the local database; the service heartbeat (its presence).
You can add extra fields to the resource table and watch more local and/or server resources, for example the network path to the local database (NP), such as »\\P2\c\temp\local«. Your local services can become the »watchdog« applications of your network resources. All peers can use this »watchdog« service. On every peer (node) you can catch messages from other peers: browse the resource table and you have a list of the present peers. For example, on the node (peer) P1 you can see whether the local service on node (peer) P2 is present (its hb is true) and whether its data is updated (its cluster value is .F.). You can read its network path (NP) and pull data from the peer P2, such as »use \\P2\c\temp\local«.
You can put in the table resource other local data: active processes, opened files, file system information ....
We have a network with multiple servers, a resource table and a synchronization (refresh) process. You can use remote services on it in various ways, let us illustrate an example.
On Node2 we can start the refresh process on Node1: in Node1's record of the resource table we replace mval with .T. (and skip to another record). When we set mval to true, we send a message to the node “Node1”; more precisely, we put this message in the “Node1” message queue. The service on “Node1” will read this message from its message queue and execute the process on the computer “Node1”. In our example, it will start the refresh process defined in the program Cluster.
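A sketch of sending this one-slot message from Node2; the field names follow the resource table used above:

```foxpro
* Sketch: from Node2, ask Node1 to refresh its replica by
* setting mval in Node1's resource record (a one-slot queue).
USE resource SHARED
LOCATE FOR ALLTRIM(node) == "Node1"
IF FOUND()
   REPLACE mval WITH .T.    && the service on Node1 polls this flag
   SKIP                     && move off the record to flush the change
ENDIF
```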
In this article I have distributed content (data), but we can also distribute the execution (computation).
Our message queue has a limited length (1). Let us illustrate how to enlarge it. Create a new table “mesqueue” (message queue) with these fields: Sender C(80), Node C(80), Message C(256), mesId C(10), Result C(89), Done L. A new message means a new record in this table, inserted with a command such as:
Insert into mesqueue (sender, node, message, done) values (“Node2”, “Node1”, “select syd, naziv from local into table 'c:\temp\loc1.dbf' where syd=10000”, .F.) (C)
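The message-queue table described above can be created with a command like this (a sketch; field widths follow the text):

```foxpro
* Sketch: create the message-queue table described in the text.
CREATE TABLE mesqueue ( ;
   Sender  C(80), ;
   Node    C(80), ;
   Message C(256), ;
   mesId   C(10), ;
   Result  C(89), ;
   Done    L )
```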
The interpretation of this command is: from node Node2 send a message to Node1. The message handler (callback function) on Node1 has to read this message, execute the command and return the path to the table with the results.
You also have to modify the program Cluster: it has to read from this table, execute the query (C) and write \\node1\temp\loc1.dbf in the field Result. After that it has to find the next “Node1” record (the next message), and so on. The application on “Node2” reads the results with a command such as use \\node1\temp\loc1.dbf and sets Done to true. Using the message, we have »distributed work« from Node2 to Node1. On Node2 we waited for the result, so the execution was synchronous.
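The message-handler part of the program Cluster might look like the following sketch. The result path is hard-coded here for illustration only; a real handler would derive it from the executed command, and error handling is omitted:

```foxpro
* Sketch of a message-handler loop for the program Cluster on Node1.
* It reads pending messages addressed to this node, executes the
* command with macro substitution, and writes back the result path.
lcMe = "Node1"
USE mesqueue SHARED
SCAN FOR ALLTRIM(node) == lcMe AND !done AND EMPTY(result)
   lcCmd = ALLTRIM(message)
   &lcCmd                                      && run the query (C)
   REPLACE result WITH "\\node1\temp\loc1.dbf" && illustrative path
ENDSCAN
```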
The table mesqueue is a message queue and a message log.
In the last example we sent a procedure to be executed on the remote node. We can look at it as a remote procedure call (RPC): in a message we put a function (a select statement) to be executed on the receiving node. After executing the function, the remote node sends us back the results.
You can put a larger number of messages in the message queue (table mesqueue). In the program Cluster you define the order of their execution.
If you execute the last example on the master node instead of »Node1« and send the message »select distinct syd,naziv from dnevnal into table c:\temp\loc1.dbf«, the query executes on the server computer. It executes fast (about 1 second), and you pull over the wire only a small subset of the data, in the table loc1.dbf. Your server obtains the capabilities of a real database server: the queries are executed on it and not on the client computer. You can transport programs to the server and results from it with simple VFP commands (without direct manipulation of TCP, sockets, threads and protocols).
You can improve your cluster by adjusting the refresh intervals and other parameters.
The program cluster.prg has two functions: 1) creating the replica database, 2) refreshing the replicated data. You can create more programs from it: create the replica, refresh it and execute functions (callback functions).
In the example we have used two refresh strategies. You can add a new one or remove the “orchestra style” (after the master data changes, the services start an asynchronous refresh of the replica tables). It is enough if you refresh the replicated data on demand, in the form's events.
When you have fresh data about a node's resources, you can build a load-balancing manager on top of it. We shall discuss this matter in another article.
VFP has built-in tools you can use to add new distribution capabilities. The local replica can improve the client's query speed (in our example by a factor of 45). VFP makes it easy to set up the parallel replica server: you add a resource table and a simple application. You also have to make some changes to your application to make use of this replica.
The parallel process coordination is built on top of the resource table, the events and the messages. In the same way you can manage other nodes' resources and parallel processes. You can distribute resources (data and execution), manage them and load-balance them.
In addition to the various techniques for improving data retrieval in VFP (indexes, normalization, query optimization), we can also include work and data distribution.