6. Operation management
The Limax framework is not only a server/client development environment but also an operation environment. The provided Switcher, GlobalId, and Auany server components interact with the Provider developed by the user to implement the complete system. Correctly configuring and tuning the servers' operating parameters, planning the interconnections between servers, monitoring their operation status, upgrading versions, migrating data, and recovering from faults are the tasks that must be completed in the operation management phase.
6.1 Configuration
Most server configuration in the Limax framework is described in xml. Refer to the configurations of the basic server components in the source code when making changes; the configuration produced when a server is generated should be adjusted to the actual operating environment. A few less common settings are expressed as Java virtual machine parameters and can be adjusted at launch.
-
The basic parameters of the xml
-
Properties
Only one attribute: file. It specifies a file that conforms to the Java Properties format. The file is used during the subsequent parsing and affects how the attribute strings that follow are resolved. It works as follows:
If an attribute string has the $key:value$ format, for example the remoteIp attribute "$auany.ipaddr:127.0.0.1$" in service-switcher.xml, it is resolved in the following steps:
1. Extract key=auany.ipaddr for the lookup.
2. Search the system properties; if the key exists, use the found value as the remoteIp. Otherwise,
3. Search the property file specified in Properties; if the key exists, use the found value as the remoteIp. Otherwise,
4. Set the remoteIp to 127.0.0.1.
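The four lookup steps above can be sketched in Java as follows. This is a minimal illustration, not the Limax implementation: the class name PlaceholderResolver and its resolve method are hypothetical, and the handling of strings that do not match the $key:value$ form (returned unchanged) is an assumption.

```java
import java.util.Properties;

// Hypothetical helper for illustration only; not part of the Limax API.
final class PlaceholderResolver {
    // Resolves an attribute string of the form $key:value$.
    // Strings that do not match that form are returned unchanged (assumption).
    static String resolve(String attr, Properties fileProps) {
        if (attr.length() < 3 || !attr.startsWith("$") || !attr.endsWith("$"))
            return attr;
        String body = attr.substring(1, attr.length() - 1);
        int sep = body.indexOf(':');
        if (sep <= 0)
            return attr;
        String key = body.substring(0, sep);       // step 1: extract the key
        String fallback = body.substring(sep + 1); // text after the first ':'
        String value = System.getProperty(key);    // step 2: system properties
        if (value == null)
            value = fileProps.getProperty(key);    // step 3: the property file
        return value != null ? value : fallback;   // step 4: the inline default
    }

    public static void main(String[] args) {
        Properties fileProps = new Properties();
        fileProps.setProperty("auany.ipaddr", "10.0.0.5"); // hypothetical file content
        System.out.println(resolve("$auany.ipaddr:127.0.0.1$", fileProps));
    }
}
```

Under this scheme, a value passed at launch with -Dauany.ipaddr=... takes precedence over the property file, and the inline default is used only when neither source defines the key.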
-
Trace
The system log configuration includes the following attributes:
outDir: the log output directory; the default is "./trace".
console: whether to also output to the console; the default value is true.
rotateHourOfDay, rotateMinute: the time of day at which the log is rotated; the default is 6 AM. Note that the rotation is executed only if the system is running when this time point is crossed.
rotatePeriod: the log rotation period in milliseconds. The default value is 86400000, which is one whole day.
level: the log level. There are five levels: DEBUG, INFO, WARN, ERROR, FATAL. The default level is WARN, and the value is case-insensitive.
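To illustrate how rotateHourOfDay, rotateMinute, and rotatePeriod fit together, the sketch below computes the delay until the first rotation. It assumes, without confirmation from the Limax source, that the first rotation fires at the next occurrence of the configured time and that later rotations follow every rotatePeriod milliseconds. The class and method names are hypothetical.

```java
import java.time.Duration;
import java.time.LocalDateTime;

// Hypothetical helper for illustration only; not part of the Limax API.
final class RotateSchedule {
    // Milliseconds from 'now' until the next occurrence of hour:minute.
    static long initialDelayMillis(LocalDateTime now, int rotateHourOfDay, int rotateMinute) {
        LocalDateTime next = now.toLocalDate().atTime(rotateHourOfDay, rotateMinute);
        if (!next.isAfter(now))
            next = next.plusDays(1); // today's time point has already passed
        return Duration.between(now, next).toMillis();
    }

    public static void main(String[] args) {
        long delay = initialDelayMillis(LocalDateTime.now(), 6, 0);
        long rotatePeriod = 86_400_000L; // the documented default, one day
        System.out.println("first rotation in " + delay + " ms, then every " + rotatePeriod + " ms");
    }
}
```

A process that is started after the time point and stopped before the next one never crosses it, so no rotation happens during that run.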
-
Limit
A quantity limit. It is currently used to control the number of connections accepted by a ServerManager, and multiple ServerManagers may reference the same limit.
name: the name of the Limit
maxSize: the maximum number
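The idea of several ServerManagers sharing one limit can be sketched as a shared counter. This is only an illustration of the concept: the class SharedLimit and its methods are hypothetical and do not reflect the actual Limax classes.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical illustration of a limit shared by several server managers.
final class SharedLimit {
    private final long maxSize;
    private final AtomicLong current = new AtomicLong();

    SharedLimit(long maxSize) {
        this.maxSize = maxSize;
    }

    // Called when a connection is accepted; returns false when the limit is reached.
    boolean tryAcquire() {
        while (true) {
            long c = current.get();
            if (c >= maxSize)
                return false;
            if (current.compareAndSet(c, c + 1))
                return true;
        }
    }

    // Called when a previously accepted connection is closed.
    void release() {
        current.decrementAndGet();
    }

    public static void main(String[] args) {
        SharedLimit limit = new SharedLimit(2);  // one Limit with maxSize=2
        boolean viaManagerA = limit.tryAcquire(); // both managers hold the same object
        boolean viaManagerB = limit.tryAcquire();
        boolean third = limit.tryAcquire();       // rejected: the shared total is 2
        System.out.println(viaManagerA + " " + viaManagerB + " " + third);
    }
}
```

Because both managers count against the same object, the cap applies to their combined connections rather than to each manager separately.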
-
JmxServer
Three mandatory attributes, host, serverPort, and rmiPort, are provided so that JMX can manage the application. The connection url is "service:jmx:rmi://<host>:<serverPort>/jndi/rmi://<host>:<rmiPort>/jmxrmi". Normally, configure uncommon ports that only the management application can reach.
The attributes username and password are optional.
When JmxServer is configured, a firewall is normally needed to block access from the Internet.
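As an illustration of the url format quoted above, the following sketch assembles the url from the three attributes and parses it with the standard javax.management.remote.JMXServiceURL class. The helper class, the host, and the port numbers are hypothetical examples, not values taken from a Limax configuration.

```java
import java.net.MalformedURLException;
import javax.management.remote.JMXServiceURL;

// Hypothetical helper for illustration only; not part of the Limax API.
final class JmxUrlExample {
    // Builds the url from the JmxServer attributes host, serverPort and rmiPort.
    static String jmxUrl(String host, int serverPort, int rmiPort) {
        return "service:jmx:rmi://" + host + ":" + serverPort
                + "/jndi/rmi://" + host + ":" + rmiPort + "/jmxrmi";
    }

    public static void main(String[] args) throws MalformedURLException {
        // Example values only; use the ones from the JmxServer element.
        JMXServiceURL url = new JMXServiceURL(jmxUrl("127.0.0.1", 10001, 10002));
        System.out.println(url);
    }
}
```

A management client would pass such a url to JMXConnectorFactory.connect, together with the username and password when they are configured.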
-
ThreadPoolSize
The network service thread pool parameters include the following attributes.
nioCpus: the number of cpus used to execute the network poll. Exceeding the number of system cpus is meaningless; the default value is 1.
netProcessors: the number of threads for receiving/sending network data; the default value is 4.
protocolSchedulers: the number of threads for protocol processing; the default value is 4.
applicationExecutors: the number of application threads; the default value is 16.
-
Manager
Corresponds to the two kinds of Manager described in the xml, type="server" and type="client", each of which describes a network endpoint. It includes the following attributes.
type: "client" or "server"
The common attributes of the client and the server:
parserCreatorClass: hands the parsing of the xml element over to an object of the class specified by this attribute.
className: the Listener class of the server or the client, used to process network messages. If absent, the default Listener is used.
classSingleton: when the className attribute exists and this attribute is set, it must be the name of a static function in that class that returns the singleton object; if it is not set, an object of the class is created directly.
defaultStateClass: if present, this class's static function getDefaultState is called to obtain the Manager's initial state; if absent, the Manager's initial state is queried from the class defined by className, with the Listener object as the parameter.
The above four attributes are tied to the source code generated during development and normally need no attention during operation and maintenance.
enable: whether this manager is allowed to start; the default value is true. It can be set to false in some operating scenarios to temporarily prevent the manager from starting.
name: the name of the endpoint, a string.
inputBufferSize: the input buffer size; the default value is 16384. There is no need to modify it unless the network throughput is very large.
outputBufferSize: the output buffer size; the default value is 16384. There is no need to modify it unless the network throughput is very large.
checkOutputBuffer: whether to check the output buffer size. If checking is enabled and, under network congestion, the accumulated data exceeds outputBufferSize, a warning is logged and the connection is closed. The default value is false.
inputSecurity: the initial secret key of the network input stream, given as a 16-byte string representing a hexadecimal number. By default there is none.
outputSecurity: the initial secret key of the network output stream, given as a 16-byte string representing a hexadecimal number. By default there is none.
inputCompress: whether the input stream is compressed in the initial state; the default value is false.
outputCompress: whether the output stream is compressed in the initial state; the default value is false.
asynchronous: whether the manager works in asynchronous mode or poll mode; the default value is false, which means poll mode.
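The interplay of the className and classSingleton attributes described above can be illustrated with plain Java reflection. The sketch is an assumption about the general technique, not the Limax loader: ListenerLoader and DemoListener are hypothetical names, and the singleton accessor is assumed to be a public static method without parameters.

```java
import java.lang.reflect.Method;

// Hypothetical illustration of className/classSingleton handling.
final class ListenerLoader {
    // Stand-in for a generated Listener class (hypothetical).
    public static final class DemoListener {
        private static final DemoListener INSTANCE = new DemoListener();

        public DemoListener() {
        }

        public static DemoListener getInstance() {
            return INSTANCE;
        }
    }

    // With classSingleton set, call that static method; otherwise create a new object.
    static Object load(String className, String classSingleton) {
        try {
            Class<?> cls = Class.forName(className);
            if (classSingleton != null && !classSingleton.isEmpty()) {
                Method accessor = cls.getMethod(classSingleton);
                return accessor.invoke(null); // static call, no receiver
            }
            return cls.getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("cannot load listener " + className, e);
        }
    }

    public static void main(String[] args) {
        String name = DemoListener.class.getName();
        System.out.println(load(name, "getInstance") == load(name, "getInstance")); // same object
        System.out.println(load(name, null) == load(name, null));                   // fresh objects
    }
}
```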
The client-specific attributes:
remoteIp: the ip address of the server
remotePort: the port of the server
connectTimeout: the connection timeout; the default value is 5 seconds
autoReconnect: whether to reconnect automatically after the connection fails. If enabled, the first reconnection is attempted after a 1-second delay, and each subsequent delay doubles until it reaches 3 minutes, which is the longest backoff time. The default value is false.
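The autoReconnect backoff described above (1 second first, doubling, capped at 3 minutes) can be sketched as follows. The class and method names are hypothetical, and the sketch assumes that the doubled value is simply clamped to the 3-minute ceiling.

```java
// Hypothetical illustration of the documented reconnect backoff.
final class ReconnectBackoff {
    static final long INITIAL_MS = 1_000;  // first retry after 1 second
    static final long MAX_MS = 180_000;    // longest backoff: 3 minutes

    // Delay before the n-th reconnection attempt (attempt numbering starts at 1).
    static long delayMillis(int attempt) {
        long delay = INITIAL_MS;
        for (int i = 1; i < attempt && delay < MAX_MS; i++)
            delay = Math.min(delay * 2, MAX_MS);
        return delay;
    }

    public static void main(String[] args) {
        for (int attempt = 1; attempt <= 10; attempt++)
            System.out.println("attempt " + attempt + ": wait " + delayMillis(attempt) + " ms");
    }
}
```

Under these assumptions the delays are 1, 2, 4, ..., 128 seconds, after which every further attempt waits 180 seconds.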
The server-specific attributes:
localIp: the ip address of the server; the default value is 0.0.0.0
localPort: the port of the server
backlog: the backlog parameter passed to listen on the server socket
limit: the name of the Limit configuration to reference, which controls the server's maximum number of connections. When the number is exceeded, the event is logged, subsequent user connections are closed, and subsequent user access is delayed. By default the name is the empty string, and all such Managers share one limit with maxSize=Long.MAX_VALUE.
autoStartListen: whether the server automatically listens on the port after launching; the default value is true.
webSocketEnabled: whether to launch as a WebSocket server to support HTML5-compatible clients; the default value is false.
If launched as a WebSocket server, the following attributes can be used to support https:
keyStore: the path of the server certificate bundle in pkcs12 format.
password: the keyStore password.
-
NodeService
The node.js service component configuration, with one attribute:
module: the node module path
An ordered series of child xml elements supplies the parameters required by the module.
<parameter value="p0"/> <parameter value="p1"/>
-
Switcher
The Switcher element is used only by the Switcher server component and provides the configuration parameters for its connections with Endpoints.
Five attributes:
cacheGroup: when a login succeeds, the Switchers synchronize the login information through this multicast address. When the connection between the Switcher and Auany is broken, this cache is used to authenticate login requests. The default is empty, which disables the cache.
cacheCapacity: the capacity of the login information cache.
needcompress.s2c: whether to compress the data stream from the Switcher to the Endpoint; the default value is true.
needcompress.c2s: whether to compress the data stream from the Endpoint to the Switcher; the default value is true.
key: the key used when the Switcher initiates verification with Auany. Refer to appendix 3 (application configuration) for the detailed description.
A series of child xml elements:
<dh group="1"/>
Configures the Diffie-Hellman groups allowed by the server. See RFC 2409 and RFC 3526.
dhgroup = 1 is supported implicitly.
dhgroup = 2, 5, 14, 15, 16, 17, 18 can be enabled by configuration.
<native id="1"/> <ws id="2"/> <wss id="3"/>
Configures the Switcher ids supported by this server. Refer to appendix 3 (application configuration) for the detailed description.
-
Auany
The Auany element is used only by the Auany server component and includes three types of child elements: plat, pay, and appstore.
The plat element: it has two main attributes, name and className. The name is the case-insensitive authentication platform name and corresponds to the platflag in the Endpoint's login configuration; the corresponding authentication requests are processed by the platform support class specified by className.
The platform support class needs to implement the limax.auany.PlatProcess interface, which provides two methods.
init: the parameter is the corresponding plat element; any other configuration required by the platform is added inside the plat element and parsed here.
check: for a detailed implementation, refer to limax.auany.plats.Test; the Test module accepts any user name, but the password must be 123456. limax.auany.local.Authenticator is the local authentication mode, which is somewhat more complex and supports three common authentication systems: radius, ldap, and sql database. In actual use, the user only needs to modify the configuration following the example and choose one of the authentication systems. The authentication systems defined in the plat element support load balancing and fault tolerance through round-robin (RR). For this, several different entries should be configured; the more entries there are, the smaller the timeout should be set, because in the implementation, if a query to one authentication system does not return a correct result (timeout or failure), the request rotates to the next authentication system.
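The round-robin behavior described for the local authentication mode can be sketched as follows. This is a simplified illustration under stated assumptions, not the limax.auany.local.Authenticator code: the Backend interface and the RoundRobinAuth class are hypothetical, a backend signals timeout or failure by returning an empty Optional, and every request advances the rotation.

```java
import java.util.Optional;

// Hypothetical illustration of round-robin authentication with failover.
final class RoundRobinAuth {
    // Returns Optional.empty() on timeout or failure, otherwise the verdict.
    interface Backend {
        Optional<Boolean> check(String user, String password);
    }

    private final Backend[] backends;
    private int next; // index of the backend to try first on the next request

    RoundRobinAuth(Backend... backends) {
        this.backends = backends;
    }

    // Tries each backend at most once, starting at the current rotation point.
    synchronized Optional<Boolean> check(String user, String password) {
        for (int i = 0; i < backends.length; i++) {
            Backend backend = backends[next];
            next = (next + 1) % backends.length;
            Optional<Boolean> verdict = backend.check(user, password);
            if (verdict.isPresent())
                return verdict; // a definite answer ends the rotation
        }
        return Optional.empty(); // every entry timed out or failed
    }

    public static void main(String[] args) {
        RoundRobinAuth auth = new RoundRobinAuth(
                (u, p) -> Optional.empty(),               // simulates a timed-out entry
                (u, p) -> Optional.of("123456".equals(p)) // simulates a working entry
        );
        System.out.println(auth.check("anyone", "123456"));
    }
}
```

In the worst case a request waits for every entry in turn, so the total latency is roughly the number of entries multiplied by the per-entry timeout, which is why more entries call for a smaller timeout.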
The pay element: it has two main attributes, gateway and className. The gateway is the numeric id assigned by the system to the third-party payment gateway and corresponds to the gateway of Endpoint.AuanyService.Pay. The className specifies the message processing class of the third-party payment gateway; this class must implement the limax.auany.PayGateway interface.
The appstore element: the basic configuration for processing appstore receipts.
Furthermore, the <xi:include href="appconfig.xml"/> inside the Auany element includes the appconfig.xml configuration file, which provides the application configuration. Refer to the appendix (application configuration) for the detailed description.
-
GlobalId
The GlobalId element is used only by the Provider service, and its configuration is similar to that of a Manager with type="client". Usually, autoReconnect="true" and the correct GlobalId server ip address and port should be configured. In addition, the timeout attribute specifies the timeout of a GlobalId request; the effect is the same as calling limax.provider.GlobalId.setTimeout(long timeout), and the default value is 2000ms.
-
Provider
Corresponds to the Manager with type="provider" described in the xml, and provides the network endpoint configuration for the Provider.
The Provider contains at least one built-in client-type Manager element, which describes the connection from the Provider to a Switcher server component. Except that the parserCreatorClass, className, classSingleton, and defaultStateClass attributes have no meaning, these Manager elements are the same as the client Manager element. A Provider may contain multiple built-in Manager elements to connect to multiple Switcher server components, which is common when the connection capacity needs to be adjusted.
The Provider element has the following attributes:
className: the ProviderListener class, used to process Provider messages. If absent, the default ProviderListener is used.
classSingleton: when the className attribute exists and this attribute is set, it must be the name of a static function in that class that returns the class singleton; if it is not set, an object of the class is created directly.
viewManagerClass: the management class of the View.
setAsOnlines: determines whether the data service starts immediately when the Provider launches. The default value is true.
The above four attributes are tied to the code generated during development and usually need no attention during operation.
name: the Provider name, a string.
pvid: the PVID parameter of the Provider, which determines the service number within the system.
key: the secret key string of the Provider, used by the Auany server to verify the validity of the Provider. The default value is null. When Auany requires verification, this string must be changed in step with the Auany configuration.
useVariant: whether the Provider supports the Variant view, which is related to the generated code. If necessary, this feature can be turned off by setting this attribute to false.
useScript: whether the Provider supports the script view, which is related to the generated code. If necessary, this feature can be turned off by setting this attribute to false.
paySupportClass: if the Provider needs to support payment, a payment processing class must be provided, and this class must implement the limax.provider.PaySupport interface.
-
Zdb
The ZDB configuration supports many attributes:
dbhome: the home of the database. If the prefix is jdbc:mysql, the MySQL database is used and the value is interpreted as the jdbcUrl; otherwise it is interpreted as the file system path of an EDB database.
preload: the table cache preload path. When zdb stops normally, the caches of all tables are clean after the last checkpoint. At this point the cache content is saved to the local disk. On the next start, the cache is initialized from the saved content, which effectively reduces the startup load when the backend is MySQL. If any abnormality occurs during loading, loading stops immediately, so a change in table structure has no adverse effect. After loading completes, the entire directory is emptied. This attribute is not configured by default.
edbCacheSize: when the EDB database is used, the maximum number of EDB cache pages. An EDB data page is 8K. When data throughput is very large, memory usage may become excessive. Once this number is exceeded, EDB automatically performs a checkpoint and releases the clean pages.
edbLoggerPages: when the EDB database is used, the maximum number of pages that an EDB log file may hold. Before each checkpoint, the page count of the current log file is checked against this limit. If the limit is exceeded, a new log file is created. If incremental backup is enabled, the previous log file is copied to the backup directory; if it is disabled, the previous log file is simply deleted.
jdbcPoolSize: when the MySQL database is used, the size of the connection pool; otherwise it has no meaning.
defaultTableCache: selects the table cache class. Currently supported are limax.zdb.TTableCacheConcurrentMap, limax.zdb.TTableCacheNull, and limax.zdb.TTableCacheLRU; the default value is limax.zdb.TTableCacheLRU.
zdbVerify: checks at run time whether the program uses locks correctly. It can be set to true during testing and to false in production to improve performance.
autoKeyInitValue, autoKeyStep: the initial value and the increment step of auto-increment keys. If the databases are expected to be merged in the future, the servers can use the same step and different initial values. Tables with auto-increment keys can then be merged directly during maintenance, without reassigning ids.
corePoolSize, procPoolSize, schedPoolSize: the sizes of zdb's core thread pool, stored procedure thread pool, and scheduling thread pool, respectively.
checkpointPeriod, marshalPeriod, marshalN, snapshotFatalTime: checkpointPeriod determines how often changed data is stored to the underlying database. marshalPeriod determines how often changed data is prepackaged (marshaled) without holding locks before being stored to the underlying database; this prepackaging can significantly improve zdb throughput. Several prepackaging passes followed by one store are acceptable. marshalN determines how many lock-free prepackaging passes are performed before the final store. The minimum value of the two periods is governed by the virtual machine parameter limax.zdb.Checkpoint.SCHED_PERIOD, whose default value is 100ms. If the total time of prepackaging and full packaging before the final store exceeds snapshotFatalTime, an error log entry is recorded, which in most cases means the system is overloaded.
deadlockDetectPeriod: the period at which the database checks for deadlocks.
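The merge-friendly key scheme described for autoKeyInitValue and autoKeyStep can be illustrated with a small generator. The class AutoKey is hypothetical, and the sketch assumes that the first key handed out equals the initial value; the actual zdb behavior may differ in that detail.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical illustration of auto-increment keys with a shared step.
final class AutoKey {
    private long next;
    private final long step;

    AutoKey(long autoKeyInitValue, long autoKeyStep) {
        this.next = autoKeyInitValue;
        this.step = autoKeyStep;
    }

    // Returns the current key and advances by the step.
    synchronized long nextKey() {
        long key = next;
        next += step;
        return key;
    }

    public static void main(String[] args) {
        // Two servers: same step (4), different initial values (0 and 1).
        AutoKey serverA = new AutoKey(0, 4);
        AutoKey serverB = new AutoKey(1, 4);
        Set<Long> merged = new HashSet<>();
        boolean collision = false;
        for (int i = 0; i < 1000; i++) {
            collision |= !merged.add(serverA.nextKey());
            collision |= !merged.add(serverB.nextKey());
        }
        System.out.println("collision after merge: " + collision);
    }
}
```

With a step of 4, up to four servers can use the initial values 0 through 3 without ever producing the same key, so their tables can be merged without renumbering.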
The ZDB element can contain one Procedure element that configures the parameters related to stored procedures. The Procedure element has four attributes:
maxExecutionTime: the maximum execution time of a stored procedure. A timeout is reported when the execution exceeds this time. In a typical OLTP application, an overly long stored procedure execution time is unreasonable.
retryTimes, retryDelay: the number of retries and the backoff delay of a procedure after a deadlock.
trace: the log level used for stored procedure logging.
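The effect of retryTimes and retryDelay can be pictured with a simple retry loop. This sketch is an assumption about the general pattern, not the zdb procedure runner: the class name is hypothetical, a failed attempt stands in for a deadlock, retryDelay is assumed to be in milliseconds, and retryTimes is assumed to count the retries made after the first attempt.

```java
import java.util.function.BooleanSupplier;

// Hypothetical illustration of retrying a procedure after a deadlock.
final class ProcedureRetry {
    // Runs 'attempt' once, then retries up to retryTimes more times,
    // sleeping retryDelayMillis before each retry. Returns true on success.
    static boolean run(BooleanSupplier attempt, int retryTimes, long retryDelayMillis) {
        for (int tries = 0; ; tries++) {
            if (attempt.getAsBoolean())
                return true;
            if (tries >= retryTimes)
                return false; // retries exhausted
            try {
                Thread.sleep(retryDelayMillis); // back off before retrying
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
    }

    public static void main(String[] args) {
        int[] calls = {0};
        // Simulated procedure: "deadlocks" twice, then succeeds.
        boolean ok = run(() -> ++calls[0] >= 3, 3, 10);
        System.out.println("success=" + ok + " after " + calls[0] + " executions");
    }
}
```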
Note that no default values are given for the ZDB configuration attributes, because the defaults come entirely from the application's xml description and are stored in the generated program files. When the application launches, this configuration overrides the corresponding configuration in the program files. Usually, only a few modifications are needed to meet the operational requirements.
-
-
Switcher
The Switcher configuration is composed of the Properties, Trace, JmxServer, ThreadPoolSize, Manager, and Switcher elements described above. Refer to the service-switcher.xml provided with the Limax source code when adjusting it. In particular:
The Switcher element is mandatory.
Among the Manager elements:
The name="ProviderServer" element defines the network parameters of this Switcher as the server for Providers.
The name="AuanyClient" element defines the network parameters of this Switcher as the client of Auany.
The name="SwitcherServer" element defines the network parameters of this Switcher as the server for Endpoint clients.
The name="SwitcherServerWebSocket" element defines the network parameters of this Switcher as the WebSocket server.
For example, if multiple ISPs must be served through several NICs (network interface cards), the Manager element named SwitcherServer can be cloned into multiple copies, each with an adjusted name and its own localIp bound to the IP address of a different ISP.
-
Auany
The Auany configuration is composed of the Properties, Trace, JmxServer, ThreadPoolSize, Manager, Auany, and ZDB elements described above. Refer to the service-auany.xml provided with the Limax source code when adjusting it.
ZDB is used here to store the assignment of users' SessionIds across the whole operation environment.
In an actual operation environment, more implementations of the limax.auany.PlatProcess interface should be created within the Auany framework to support more third-party platforms.
-
GlobalId
The GlobalId configuration is composed of the Properties, Trace, JmxServer, ThreadPoolSize, Manager, and ZDB elements described above. Refer to the service-globalid.xml provided with the Limax source code when adjusting it.
ZDB is used here to store the assigned global ID information.
-
Provider
The Provider configuration is composed of the Properties, Trace, JmxServer, ThreadPoolSize, Manager, ZDB, GlobalId, and Provider elements described above. Refer to the service-xxx.xml in the generated server source code when adjusting it.
A Provider must have one and only one Provider element.
If the service provided by the Provider uses the GlobalId service, the GlobalId element must be configured; otherwise it should be omitted.
If the service provided by the Provider uses ZDB, the ZDB element must be configured; otherwise it should be omitted.
-
The parameters of the Java virtual machine
limax.net.io.NetModel.delayPoolSize: the size of the scheduling thread pool used when the network layer triggers various timeout operations. The default value is 1, and there is no need to adjust it.
limax.util.ConcurrentEnvironment.timeoutSchedulerSize: the size of the scheduling thread pool of the executor that supports timeouts; it is used for timeouts in ZDB stored procedures. The default value is 3, and there is no need to adjust it.
limax.zdb.Checkpoint.SCHED_PERIOD: the minimum detection period of the ZDB checkpoint. The default value is 100ms, and there is no need to adjust it.
limax.zdb.Lockeys.bucketShift: the shift that determines the hash bucket count of ZDB's internal locks. The default value is 10, which means that the bucket size is 1<<10 == 1024. There is no need to adjust it.
limax.zdb.Zdb.useFixedThreadPool: boolean type. If set to true, the core thread pool and the procedure thread pool of zdb use fixed thread pools, in which case tasks are queued in a LinkedBlockingQueue so that none are lost. This parameter should not be set unless too many stored procedures are generated, producing an instantaneous heavy load that seriously affects system response. If this approach is used, the corresponding pool sizes in the zdb configuration should be increased. Setting this parameter should be only a temporary measure; revising the design is the sound solution. Be aware that queuing tasks in a LinkedBlockingQueue may lead to starvation that is difficult to detect when tasks depend on each other (a running task depends on a task that is still queued).
limax.net.Engine.limitProtocolSize: the hard limit on protocol size, which caps the size of all protocols and View protocols. If the size is exceeded, the connection is terminated and the event is logged. The default value is 1048576. For an interactive application, the total size of the View fields changed at one time should not be too large, because that causes excessive network traffic and is not good design. For a network transfer application, increasing this parameter on the Provider and Switcher servers can be considered.
limax.net.Engine.intranetKeepAliveTimeout: in some cloud environments, the reliability of the internal network cannot be guaranteed. This parameter configures the timeout of keepalive detection between servers. On timeout, the connection is closed. The server acting as the server (usually the Switcher) clears the corresponding state information and returns to the initial state. The server acting as the client (usually the Provider), if automatic reconnection is configured, tries to recover the correct state by reconnecting. The default value is 0ms, which disables detection.
limax.switcher.SwitcherListener.handShakeTimeout: the time window allowed between the Switcher accepting an Endpoint connection and the Switcher processing the Endpoint's handshake request. The default value is 1000ms. This limit is missed only if the Switcher server is overloaded or the client performance is too low. If the server hardware performance is not a problem, this parameter can be decreased appropriately on the Switcher server, which helps defend against attacks.
limax.net.WebSocketServer.handShakeTimeout: the previous parameter has no meaning for a server running in WebSocket mode, because by design the WebSocket protocol switches to the WebSocket data exchange state after the first HTTP request completes, at which point the required authentication parameters are all ready. In WebSocket mode, this parameter plays the same role as the previous one.
limax.net.WebSocketServer.maxMessageSize: the maximum message size allowed by the WebSocket server. The default value is 65536. If large messages must be exchanged via WebSocket, this parameter can be increased on the Switcher server. However, an excessive setting makes it harder for the Switcher to defend against traffic attacks.
limax.net.WebSocketServer.keyExchangeTimeout: in extended websocket mode, the key exchange timeout; the default value is 3000ms.
limax.net.WebSocketServer.dhGroupMax: in extended websocket mode, the maximum dhgroup allowed during key exchange; the default value is 2.
limax.net.secureIp: if the Switcher is in a DNAT environment, the external network ip is specified with this parameter so that key negotiation can be carried out correctly in native mode or extended websocket mode.
limax.switcher.SwitcherListener.sessionLoginTimeout: the longest time allowed from a successful handshake between the Switcher and the Endpoint until the Endpoint is normally online. The default value is 20000ms. Exceeding this duration means that communication between the Switcher and the Endpoint, or between the Switcher and Auany, has a problem. Usually there is no need to adjust it.
limax.switcher.SwitcherListener.keepAliveTimeout: the maximum allowed interval between Ping protocols received by the Switcher; the Endpoint periodically sends the Ping protocol to the Switcher. When this interval is exceeded, the Switcher closes the network connection. The default value is 60000ms.
limax.switcher.SwitcherListener.pingProtect: the protection period of the Switcher's Ping. If the Switcher receives multiple Ping protocols within the protection period, it closes the network connection. The default value is 30000ms.
The previous two parameters relate to the Endpoint: every Endpoint in the release version uses a ping period of 50000ms, which lies within the window defined by these two parameters. If they need to be adjusted, some tolerance must be left.
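The window formed by pingProtect and keepAliveTimeout can be checked with a small calculation before changing either value. The helper below is hypothetical, and the tolerance of 5000ms used in the example is an arbitrary illustration rather than a recommended value.

```java
// Hypothetical helper for checking the ping window; not part of the Limax API.
final class PingWindow {
    // True when the ping period, shifted by the tolerance in either direction,
    // stays longer than pingProtect and shorter than keepAliveTimeout.
    static boolean fits(long pingPeriodMs, long pingProtectMs, long keepAliveTimeoutMs, long toleranceMs) {
        return pingPeriodMs - toleranceMs > pingProtectMs
                && pingPeriodMs + toleranceMs < keepAliveTimeoutMs;
    }

    public static void main(String[] args) {
        // Documented defaults: ping 50000ms, pingProtect 30000ms, keepAliveTimeout 60000ms.
        System.out.println(fits(50_000, 30_000, 60_000, 5_000));
        // Lowering keepAliveTimeout to 52000ms leaves too little tolerance.
        System.out.println(fits(50_000, 30_000, 52_000, 5_000));
    }
}
```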
limax.node.js.EventLoop.corePoolSize: the minimum number of threads in the event loop thread pool of the node.js framework. All Clusters share the same thread pool, and the default is 64.
limax.node.js.modules.Dns.corePoolSize: in the node.js framework, the DirContext pool capacity of the dns module; the default is 16. Unless the application issues very many concurrent dns queries, there is generally no need to adjust it.
limax.node.js.module.Net.TLSExchange.concurrency: in the node.js framework, when TLS support is enabled on a socket object of the net module, the number of SSLEngine instances used concurrently; the default is 32. A heavily loaded TLS server can experimentally increase this value.
limax.node.js.module.Sql.ConnectionFadeout: in the node.js framework, the fade-out timeout of connections in the connection pool of the sql module; the default is 60000ms, and there is generally no need to adjust it.
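Parameters of this kind are typically passed as -D system properties on the java command line, for example -Dlimax.net.Engine.limitProtocolSize=4194304. The sketch below shows how such values can be read with their documented defaults using standard JDK calls. It assumes the parameters are ordinary system properties; the class JvmParams is hypothetical and is not how Limax itself is structured.

```java
// Hypothetical illustration of reading JVM parameters with documented defaults.
final class JvmParams {
    // Hard limit of the protocol size; documented default 1048576.
    static int limitProtocolSize() {
        return Integer.getInteger("limax.net.Engine.limitProtocolSize", 1048576);
    }

    // Lock bucket count derived from the shift; documented default shift 10.
    static int lockBuckets() {
        return 1 << Integer.getInteger("limax.zdb.Lockeys.bucketShift", 10);
    }

    public static void main(String[] args) {
        System.out.println("limitProtocolSize = " + limitProtocolSize());
        System.out.println("lock buckets      = " + lockBuckets());
    }
}
```

Integer.getInteger falls back to the supplied default when the property is missing or is not a valid number, so a mistyped value silently leaves the default in effect.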