Ching-Chuan Chen's Blogger

Statistics, Machine Learning and Programming

How Does R Handle Existing Connections on Exit?

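This post looks at how R deals with a process's connections when that process exits.

Anyone familiar with R knows that showConnections(all = TRUE) lists every connection the current process holds.

There are always at least three: stdin, stdout, and stderr.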

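connections.c also defines the maximum number of connections.

In general, R's connection limit is 128, and the only way to get a higher one is to compile R yourself. This number matters in several ways.

One of them: parallel and snow both let R spawn slaves on multiple machines over ssh,

and the maximum number of slaves is this number minus 3 (stdin, stdout, and stderr have to be subtracted).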
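The arithmetic above can be illustrated with a small, self-contained C sketch. It is a toy model under stated assumptions, not code from connections.c: the names `MAX_CONNECTIONS`, `RESERVED_SLOTS`, `slot_in_use`, `init_table`, `claim_slot`, and `count_available` are all hypothetical, and the table only tracks whether a slot is taken.

```c
#include <assert.h>

#define MAX_CONNECTIONS 128   /* the limit discussed above */
#define RESERVED_SLOTS  3     /* stdin, stdout, stderr */

/* Hypothetical stand-in for a connection table: non-zero means "in use". */
static int slot_in_use[MAX_CONNECTIONS];

/* Mark the three standard streams as taken and free every other slot. */
void init_table(void)
{
    for (int i = 0; i < MAX_CONNECTIONS; i++)
        slot_in_use[i] = (i < RESERVED_SLOTS);
}

/* Claim the first free slot after the reserved ones; -1 means the table is full. */
int claim_slot(void)
{
    for (int i = RESERVED_SLOTS; i < MAX_CONNECTIONS; i++) {
        if (!slot_in_use[i]) {
            slot_in_use[i] = 1;
            return i;
        }
    }
    return -1;
}

/* Count how many connections (for example, one socket per slave)
   can be opened before the table fills up. */
int count_available(void)
{
    int n = 0;
    init_table();
    while (claim_slot() >= 0)
        n++;
    assert(n <= MAX_CONNECTIONS - RESERVED_SLOTS);
    return n;
}
```

Calling `count_available()` fills the table and reports how many slots were free, which is the table size minus the three reserved entries. Any further `claim_slot()` call then returns -1, which is the point at which a real process would have to refuse a new connection.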

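That was a bit of a digression. Today's topic is how R handles these connections internally.

To find out, we start with connections.c.

Right above the function R_new_custom_connection there are two comment blocks, which say: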

/* --- C-level entry to create a custom connection object -- */
/* The returned value is the R-side instance. To avoid additional call to getConnection()
   the internal Rconnection pointer will be placed in ptr[0] if ptr is not NULL.
   It is the responsibility of the caller to customize callbacks in the structure,
   they are initialized to dummy_ (where available) and null_ (all others) callbacks.
   Also note that the resulting object has a finalizer, so any clean up (including after
   errors) is done by garbage collection - the caller may not free anything in the
   structure explicitly (that includes the con->private pointer!).
 */

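The key point is that every connection object has a finalizer, so all clean-up, including clean-up after errors, gets handled properly.

Looking at R_new_custom_connection itself, we can see that it does call R_RegisterCFinalizerEx.

Digging further, the function it registers is conFinalizer, which in turn leads to con_destroy,

then to con_close1, which calls the connection's own close method to finish closing it.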
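The call chain described here (finalizer, then destroy, then close, then the connection's own close method) can be sketched in plain C. This is a simplified model and not R's actual code: the `toy_` names, the `toy_conn` struct, and the `closed_count` counter are hypothetical, and a real Rconnection carries many more fields and callbacks.

```c
#include <assert.h>
#include <stdlib.h>

/* Simplified, hypothetical connection: only what this sketch needs. */
typedef struct toy_conn {
    int is_open;
    void (*close)(struct toy_conn *con);  /* type-specific close method */
} toy_conn;

/* Counts how many times a close method actually ran (for illustration). */
int closed_count = 0;

/* A close method for one hypothetical connection type. */
static void toy_file_close(toy_conn *con)
{
    con->is_open = 0;
    closed_count++;
}

/* Create an open connection; returns NULL if allocation fails. */
toy_conn *toy_open(void)
{
    toy_conn *con = malloc(sizeof *con);
    if (con == NULL) return NULL;
    con->is_open = 1;
    con->close = toy_file_close;
    return con;
}

/* Last step: dispatch to the connection's own close method if still open. */
static void toy_con_close1(toy_conn *con)
{
    if (con->is_open)
        con->close(con);
}

/* Middle step: close the connection, release it, and clear the slot. */
static void toy_con_destroy(toy_conn **slot)
{
    assert(slot != NULL);
    if (*slot == NULL) return;   /* already destroyed: nothing to do */
    toy_con_close1(*slot);
    free(*slot);
    *slot = NULL;
}

/* First step: the function that would be registered as the finalizer. */
void toy_con_finalizer(toy_conn **slot)
{
    toy_con_destroy(slot);
}
```

In this model, calling `toy_con_finalizer(&con)` on an open connection runs the close method once and sets `con` to NULL, and a second call is a harmless no-op. The design point it is meant to show is that the finalizer knows nothing about the connection type; the type-specific work lives behind the `close` function pointer.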

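But how do we know that R calls the finalizer when it exits?

For that we need to dig into R_RegisterCFinalizerEx in memory.c.

Following the chain R_RegisterCFinalizerEx -> R_MakeWeakRefC -> MakeCFinalizer, we see that the function ends up registered as a CFinalizer.

Working back up from GetCFinalizer, the function that retrieves a CFinalizer, we reach R_RunWeakRefFinalizer -> RunFinalizers.

Here is the comment inside RunFinalizers: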

/** A top level context is established for the finalizer to
insure that any errors that might occur do not spill
into the call that triggered the collection.
**/

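It says that a top-level context is established for the finalizer so that any error raised inside the finalizer does not spill into the call that triggered the collection.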
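To make the idea in that comment concrete, here is a minimal sketch that uses setjmp/longjmp to give every finalizer its own recovery point, so that a failing finalizer cannot unwind into whoever started the run. It is an illustration under stated assumptions, not the code in memory.c: the registry, `toy_register`, `toy_error`, `toy_run_finalizers`, and the two example finalizers are all hypothetical names.

```c
#include <assert.h>
#include <setjmp.h>
#include <stddef.h>

#define MAX_FINALIZERS 16

typedef void (*toy_finalizer)(void *obj);

typedef struct {
    toy_finalizer fun;
    void *obj;
} toy_entry;

static toy_entry registry[MAX_FINALIZERS];
static int n_registered = 0;

/* Jump target playing the role of the "top level context". */
static jmp_buf *current_context = NULL;

/* Register a finalizer for obj; returns -1 if the registry is full. */
int toy_register(void *obj, toy_finalizer fun)
{
    if (n_registered >= MAX_FINALIZERS) return -1;
    registry[n_registered].obj = obj;
    registry[n_registered].fun = fun;
    n_registered++;
    return 0;
}

/* Signal an error from inside a finalizer: jump back to the context
   that the runner set up for it. */
void toy_error(void)
{
    assert(current_context != NULL);
    longjmp(*current_context, 1);
}

/* Example finalizer that succeeds: increments the int that obj points to. */
void finalizer_ok(void *obj)
{
    (*(int *)obj)++;
}

/* Example finalizer that always fails. */
void finalizer_fails(void *obj)
{
    (void)obj;
    toy_error();
}

/* Run every registered finalizer under its own context and return how
   many of them failed. A failure never escapes this function. */
int toy_run_finalizers(void)
{
    volatile int i;            /* volatile: must survive a longjmp */
    volatile int failed = 0;
    jmp_buf ctx;
    jmp_buf *saved = current_context;

    for (i = 0; i < n_registered; i++) {
        current_context = &ctx;
        if (setjmp(ctx) == 0)
            registry[i].fun(registry[i].obj);
        else
            failed++;          /* the finalizer called toy_error() */
    }
    n_registered = 0;
    current_context = saved;
    return failed;
}
```

If three finalizers are registered and the second one fails, `toy_run_finalizers()` still runs the first and third and simply reports one failure. The caller is never unwound, which is the property the comment describes.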

That concludes our trace through the R source code.

What we have seen is that when R itself exits, it reclaims all of its connections no matter what, so that no connection is left behind.