追求新境界: hacking


Monday, March 28, 2011

simple & clean daemon code

This post originally comes from here: [ uCdot ]

The little trick here is to use the '-D' option as a marker that
distinguishes the daemon process from its parent process.

The second time the program exec()s itself, the daemon process
starts with the -D option, which sends it to the 'else' block:

It's a clean and simple example!
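
The -D trick can be sketched roughly as follows. This is a minimal sketch, not the code from the original post: the helper names has_daemon_flag(), spawn_daemon() and daemon_entry() are hypothetical, and it assumes the program can be found again through its own argv[0].

```c
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Return 1 if "-D" appears among the arguments, i.e. we are the daemon. */
static int has_daemon_flag(int argc, char *const argv[])
{
    for (int i = 1; i < argc; i++)
        if (strcmp(argv[i], "-D") == 0)
            return 1;
    return 0;
}

/* Parent path: fork, then re-exec the same program with "-D" added. */
static int spawn_daemon(const char *self)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        setsid();                       /* detach from the controlling terminal */
        execlp(self, self, "-D", (char *)NULL);
        _exit(127);                     /* only reached if exec failed */
    }
    return 0;                           /* the parent is done and may exit */
}

/* Hypothetical entry point, meant to be called from main(argc, argv). */
static int daemon_entry(int argc, char *const argv[])
{
    assert(argc >= 1 && argv != NULL);
    if (!has_daemon_flag(argc, argv))
        return spawn_daemon(argv[0]);   /* first run: no -D, so spawn the daemon */
    /* 'else' block: second run, started with -D; the daemon's work goes here */
    return 1;
}
```

Calling daemon_entry(argc, argv) from main() gives the two passes described above: the first pass forks and re-executes, the second pass sees -D and runs the daemon body.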

Friday, March 11, 2011

Linux serial port programming: setting CLOCAL

First, let's look at what the Carrier Detect (CD) signal line means.
Quoted from the book "Serial Communication" by Mark Nelson:
"Unfortunately, before the era of intelligent modems, some terminals
or other DTE equipment were designed to treat a modem as unusable
without CD. Devices such as these would not send or receive
characters to a modem that had CD low. Because of these anachronisms,
most modems built today can keep CD high at all times, whether or not
a carrier has been established.
Because this feature is sometimes the default mode of operation,
using CD for accurate detection of carrier presence is somewhat risky.
However, most users should be able to configure their modems so as to
disable this troublesome behavior."

In plain words: in the old days, CD was how the modem (the DCE side) told the terminal (the DTE side):
"I am connected to the remote end!" Later modems (DCE), however, stopped honoring this convention.
Some raise CD all the time, whether or not a connection to the remote end exists.
So a DTE can no longer rely on CD to tell whether the DCE is connected.
That is why a DTE today must be configurable to ignore this unreliable CD signal.

How is that configured? The Linux serial driver provides a flag for it. The termios attribute
CLOCAL is checked in 2.6.28 serial_core.c ::
uart_change_speed():
{    ...
    if (termios->c_cflag & CLOCAL)
        state->info->flags &= ~UIF_CHECK_CD;
    ...
}

So what does UIF_CHECK_CD do?

uart_handle_dcd_change(struct uart_port *port, unsigned int status)
{ ...
    if (info->flags & UIF_CHECK_CD) {
        if (status)
            wake_up_interruptible(&info->port.open_wait);
        else if (info->port.tty)
            tty_hangup(info->port.tty);
    }
    ...
}

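In other words, when UIF_CHECK_CD is set, the driver either wakes up the processes waiting in open
or hangs up the tty. These are the actions originally intended for a change of the CD signal.

So here, setting CLOCAL "masks" the checking of the CD signal.
If your application forgets to set CLOCAL when it opens the serial port,
the driver follows the behavior described in the book: once the DTE finds that the DCE's CD is gone,
it hangs up the tty.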
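
On the application side, setting CLOCAL might look like the sketch below. It uses only the standard termios calls; the helper names ignore_carrier_detect() and open_serial_local() are hypothetical, and opening with O_NONBLOCK is an assumption made here so that open() itself does not wait for CD.

```c
#include <assert.h>
#include <fcntl.h>
#include <termios.h>
#include <unistd.h>

/* Set CLOCAL (ignore modem control lines such as CD) and CREAD (enable receiver). */
static void ignore_carrier_detect(struct termios *tio)
{
    assert(tio != NULL);
    tio->c_cflag |= CLOCAL | CREAD;
}

/* Open a serial device and apply CLOCAL; returns the fd, or -1 on error. */
static int open_serial_local(const char *path)
{
    struct termios tio;
    int fd = open(path, O_RDWR | O_NOCTTY | O_NONBLOCK);

    if (fd < 0)
        return -1;
    if (tcgetattr(fd, &tio) < 0) {      /* read the current settings */
        close(fd);
        return -1;
    }
    ignore_carrier_detect(&tio);
    if (tcsetattr(fd, TCSANOW, &tio) < 0) {  /* apply them immediately */
        close(fd);
        return -1;
    }
    return fd;
}
```

A caller would use something like open_serial_local("/dev/ttyS0"); the device path is only an example.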

Friday, November 19, 2010

Some C preprocessor usages

In the wireshark source I came across an odd bit of C: _U_, whose definition I could not find. Noting it here as a memo:
hostname$ cat foo.c
int foo(int a _U_, int b _U_, int c)
{
    return c + 17;
}
hostname$ gcc -D_U_= -c -O2 -Wall -W foo.c
foo.c: In function `foo':
foo.c:2: warning: unused parameter `a'
foo.c:2: warning: unused parameter `b'

Above, _U_ is defined as "nothing at all", and the result is compiler warnings.

hostname$ gcc -D_U_="__attribute((unused))" -c -O2 -Wall -W foo.c
hostname$ gcc --version
2.95.1f

Here _U_ is defined as the attribute that suppresses the unused-variable warning, so the compiler warnings are gone!
So why not simply remove the unused parameters? I don't know! My guess is that it lets the function code be changed when needed later without touching a lot of callers.
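
The same idea can be written inside the source file instead of on the gcc command line. This is only a sketch: the #if guard and the foo_selfcheck() helper are additions made here for illustration, not how wireshark defines _U_.

```c
#include <assert.h>

/* Define _U_ as the "unused" attribute on GCC-compatible compilers, else as nothing. */
#if defined(__GNUC__)
# define _U_ __attribute__((unused))
#else
# define _U_
#endif

/* a and b are kept in the signature for the callers, but deliberately unused. */
int foo(int a _U_, int b _U_, int c)
{
    return c + 17;
}

/* Small self-check: the unused parameters do not affect the result. */
void foo_selfcheck(void)
{
    assert(foo(1, 2, 3) == 20);
}
```

Compiled with -Wall -W, this version should stay quiet about a and b, just like the second gcc run above.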

Thursday, December 3, 2009

sparse type checking

A compiler mostly checks syntax.
To check a program's semantics, Linus created sparse. The article [ Linux Journal "Linus & Lunatics" ] describes his thinking at the time:

The relevant code can be seen in compiler.h (kernel source 2.6.28)

#ifdef __CHECKER__
# define __user        __attribute__((noderef, address_space(1)))
# define __kernel    /* default address space */
# define __safe        __attribute__((safe))
# define __force    __attribute__((force))
# define __nocast    __attribute__((nocast))
# define __iomem    __attribute__((noderef, address_space(2)))
# define __acquires(x)    __attribute__((context(x,0,1)))
# define __releases(x)    __attribute__((context(x,1,0)))
# define __acquire(x)    __context__(x,1)
# define __release(x)    __context__(x,-1)
# define __cond_lock(x,c)    ((c) ? ({ __acquire(x); 1; }) : 0)
extern void __chk_user_ptr(const volatile void __user *);
extern void __chk_io_ptr(const volatile void __iomem *);
#else
# define __user
# define __kernel
# define __safe
# define __force
# define __nocast
# define __iomem
# define __chk_user_ptr(x) (void)0
# define __chk_io_ptr(x) (void)0
# define __builtin_warning(x, y...) (1)
# define __acquires(x)
# define __releases(x)
# define __acquire(x) (void)0
# define __release(x) (void)0
# define __cond_lock(x,c) (c)
#endif

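In a normal gcc build __CHECKER__ is not defined, so many of these macros get empty definitions (the #else part of the code).
When sparse is run, __CHECKER__ is defined (the #ifdef part of the code). Take the first line as an example:

# define __user        __attribute__((noderef, address_space(1)))

This lets sparse treat a pointer marked __user as pointing into user space: address_space(1) puts it in a separate address space, and noderef forbids dereferencing it directly. It is used in copy_to_user(), for example: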

unsigned long
copy_to_user(void __user *to, const void *from, unsigned long n)
{...}

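The compiler has no way to know whether the "to" argument is correct! sparse, however, issues a warning if it finds that what is passed as to is not annotated as a user-space pointer.
These attributes exist for sparse, so you will not find them in the gcc documentation (yes, don't blame the gcc documentation for being poorly written...)

The article [ linux 內核中 compiler.h 文件的分析 ] (an analysis of compiler.h in the Linux kernel) also explains that __acquire / __release must be used in pairs, otherwise sparse issues a warning.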
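
To see how the two halves of the #ifdef work outside the kernel, here is a small user-space sketch. demo_copy_to_user() is a hypothetical stand-in, not the kernel's copy_to_user(); it simply calls memcpy(). Under plain gcc the annotations vanish, while a checker that defines __CHECKER__ would see the address-space attributes.

```c
#include <assert.h>
#include <string.h>

/* Same pattern as compiler.h: real attributes for the checker, nothing for gcc. */
#ifdef __CHECKER__
# define __user  __attribute__((noderef, address_space(1)))
# define __force __attribute__((force))
#else
# define __user
# define __force
#endif

/* Hypothetical stand-in for copy_to_user(): "to" is annotated as a user pointer. */
unsigned long demo_copy_to_user(void __user *to, const void *from, unsigned long n)
{
    assert(from != NULL);
    /* __force tells the checker that dropping the address space here is intentional. */
    memcpy((__force void *)to, from, n);
    return 0;   /* bytes NOT copied, the convention copy_to_user() uses */
}
```

Passing an unannotated kernel pointer where a __user pointer is expected is the kind of mismatch the checker is meant to report; gcc alone compiles it silently.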

Linus really is a genius: with the sparse tool, the kernel source can be checked for program semantics!

Thursday, September 3, 2009

Kernel notes --- oops-like system messages

void __this_fixmap_does_not_exist(void)
{
    WARN_ON(1);
}

I saw this code in the kernel. Where is it used?
static __always_inline unsigned long fix_to_virt(const unsigned int idx)
{
    if (idx >= __end_of_fixed_addresses)
        __this_fixmap_does_not_exist();

    return __fix_to_virt(idx);
}
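(The comments were removed here for brevity.) The if is not expected to ever be true; if it does happen, the kernel prints an oops-like state dump to make debugging easier. The key is WARN_ON(1). Tracing further down, it calls warn_on_slowpath(__FILE__,__LINE__) to do the work, which uses several handy techniques:
1. __builtin_return_address(0) asks the compiler for the return address of the current function.
2. sprint_symbol() helps look up the symbol name, and dump_stack() prints the current stack.
3. Unlike panic(), WARN_ON() does not halt the system. In principle it can be used to print the kernel state at the moment a problem occurs, which may be useful for debugging while developing a driver.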
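
The "report and keep running" pattern can be imitated in user space. The sketch below is an analogy only, not the kernel implementation: DEMO_WARN_ON(), demo_warn_slowpath() and demo_lookup() are hypothetical names, and it prints to stderr instead of dumping a stack. It does use the same GCC builtin, __builtin_return_address(0).

```c
#include <assert.h>
#include <stdio.h>

static int demo_warn_count;   /* how many warnings have fired so far */

/* Print where the warning fired and the caller's return address, then return. */
static void __attribute__((noinline)) demo_warn_slowpath(const char *file, int line)
{
    void *caller = __builtin_return_address(0);

    fprintf(stderr, "WARNING: at %s:%d, return address %p\n", file, line, caller);
    demo_warn_count++;
}

/* Evaluates to 1 when cond is true (after warning), else 0; never aborts. */
#define DEMO_WARN_ON(cond) \
    ((cond) ? (demo_warn_slowpath(__FILE__, __LINE__), 1) : 0)

/* Hypothetical lookup guarded the way fix_to_virt() guards its index. */
static int demo_lookup(const int *table, unsigned int size, unsigned int idx)
{
    assert(table != NULL);
    if (DEMO_WARN_ON(idx >= size))
        return -1;                /* report the problem and recover */
    return table[idx];
}
```

An out-of-range index produces one warning line and a -1 result, and the program carries on, which is the property that separates WARN_ON() from panic().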

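Tuesday, August 4, 2009

More on QEMU

Two years ago, [QEMU 安裝與試跑] (QEMU installation and trial run) explored emulating an ARM Versatile system. I thought I knew how to use QEMU after that, but over the last two days I failed to emulate x86, so I had to pick up the [QEMU Documentation] again and read it carefully to fill in the concepts I was missing. Below are the sources & images to prepare:

linux-2.6.28: downloaded from the [kernel archive] and unpacked,
linux-0.2.img.bz2: downloaded from QEMU Documentation | [download] |,
bzImage-2.6.28-test: obtained by building linux-2.6.28,
initrd.img-2.6.28-test.bz2: the file system obtained by building linux-2.6.28, installing the modules (make modules_install), and running update-initramfs -c -k 2.6.28-test
(unpack it with bunzip2 -c initrd.img-2.6.28-test.bz2 | cpio -idm)

Here are several failed attempts to boot: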
#> qemu -kernel bzImage-2.6.28-test
==>A disk image must be given for 'hda' when booting a Linux kernel
(if you really don't want it, use /dev/zero)
#> qemu -kernel bzImage-2.6.28-test -hda /dev/zero
==> booting from Hard Disk
==>failed : not a bootable disk
==> no bootable device
No clue, so let's check the online manual.
Section 3.8, Direct Linux Boot, of the [QEMU Documentation] says:
`...It is very useful for fast Linux kernel testing.
The syntax is :
qemu -kernel arch/i386/boot/bzImage -hda root-2.4.20.img -append "root=/dev/hda" `

"...Use ‘-kernel’ to provide the Linux kernel image and ‘-append’ to give the kernel command line arguments. The ‘-initrd’ option can be used to provide an INITRD image.
When using the direct Linux boot, a disk image for the first hard disk ‘hda’ is required because its boot sector is used to launch the Linux kernel..."

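So we need to prepare a disk image, namely the linux-0.2.img downloaded earlier, so that QEMU can load the kernel image through the disk image's boot sector, and pass the kernel parameter "root=/dev/hda" to tell the kernel that the root filesystem is on the disk image!

#>qemu -kernel bzImage-2.6.28-test -hda linux-0.2.img -append "root=/dev/hda"

Success at last: the kernel shown is the 2.6.28-test that was fed in. Note that the disk image must not be a compressed bz2 or gz file, otherwise the boot gets stuck at "waiting for root filesystem..." What a pity!

Conclusion: emulating x86 requires a disk image (even when the kernel image on the disk image is not used), which differs from emulating ARM.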

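Tuesday, May 26, 2009

A first look at RCU

On Paul McKenney's [website] there is a [basic introduction to RCU]. While it is still fresh in my mind after reading it, here is a quick summary of the key points and common questions:
1. Why RCU is needed:
RCU (Read Copy Update) is a synchronization mechanism that allows multiple reads to occur concurrently with updates. Readers take no lock, and inside the kernel it suits data that has many readers but few writers. Because most systems today have multiple CPUs, RCU's reader scalability (any number of readers, with a very low cost to enter the reader-side critical section) gives it better performance than locking in those situations.

2. How RCU works, and its terminology: (source: [RCU Wiki] )
RCU splits an update into two phases: "removal" and "reclamation".
The removal phase removes a pointer to some data while readers are allowed to keep running. Readers may see either of two versions of the data (before the removal or after it). Because only a pointer is changed, the data cannot be corrupted as long as the pointer update is an atomic operation.
In the reclamation phase, the data can be safely freed once it is certain that no reader will reference it again. This relies on two conditions: first, every reader visits the linked list in a single direction and never goes back; second, a reader must not block or be preempted while inside the "reader-side critical section". With these, the reclamation phase can be designed as simply as: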

void synchronize_rcu(void) {
   int cpu;
   for_each_online_cpu(cpu)
      run_on(cpu);
}

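Once the updater has been scheduled on every CPU in turn, the following is guaranteed: every reader that was referencing the old version (from before the removal) when synchronize_rcu() started has finished, and from this point on all readers see only the new version of the list (with the data removed)! The time spent waiting for all CPUs to go around is called the "Grace Period". After the grace period we can be sure that no reader still references the data, so it can safely be freed (that is, reclamation).

What we just discussed is the RCU delete operation; the RCU add/replace operations work on the same principle.

3. Some RCU APIs:

Suppose we have a code fragment in which the updater does the following:
Example 1: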
1 struct foo {
2   int a;
3   int b;
4   int c;
5 };
6 struct foo *gp = NULL;
7
8 /* . . . */
9
10 p = kmalloc(sizeof(*p), GFP_KERNEL);
11 p->a = 1;
12 p->b = 2;
13 p->c = 3;
14 gp = p;

Nothing here guarantees that the compiler or CPU will not reorder lines 11-14, so we change it to:

1 p->a = 1;
2 p->b = 2;
3 p->c = 3;
4 rcu_assign_pointer(gp, p);
rcu_assign_pointer() contains a memory barrier, which guarantees that the pointer assignment happens after lines 1-3.

Likewise, on the reader side, the reader also needs a memory barrier:
Example 2:
1 p = gp;
2 if (p != NULL) {
3   do_something_with(p->a, p->b, p->c);
4 }
This is correct on most architectures, but on DEC Alpha the loads of p->a, p->b, p->c may still take effect earlier than the fetch of p,
so it should be changed to:
Example 3:
1 rcu_read_lock();
2 p = rcu_dereference(gp);
3 if (p != NULL) {
4   do_something_with(p->a, p->b, p->c);
5 }
6 rcu_read_unlock();

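Here rcu_dereference() contains a memory barrier, which guarantees that p->a, p->b, p->c are loaded only after p has been fetched.
rcu_read_lock() and rcu_read_unlock() are also required: they mark the reader-side critical section. In a preemptible kernel,
what they do is temporarily disable preemption (condition two above); in a non-preemptible kernel they have nothing to do and expand to empty macros.

First, a quick detour to look at rcu_dereference():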
#define rcu_dereference(p)     ({ \
                typeof(p) _________p1 = ACCESS_ONCE(p); \
                smp_read_barrier_depends(); \
                (_________p1); \
                })
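ACCESS_ONCE makes sure the read of p is not affected by compiler optimization, so it is really performed exactly once. smp_read_barrier_depends() means "flush all pending reads that subsequents reads depend on" (reads after the barrier that depend on a read before the barrier are ordered after that earlier read).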
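
The publish/subscribe ordering can be tried out in user space with C11 atomics. This is an analogy under stated assumptions, not the kernel API: a release store plays the role of rcu_assign_pointer(), an acquire load plays the role of rcu_dereference(), and publish_foo() and read_sum() are hypothetical names. Grace periods and reclamation are not modelled at all.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

struct foo { int a, b, c; };

static _Atomic(struct foo *) gp;   /* global pointer; static storage starts as NULL */

/* Updater: initialize the structure completely, then publish it. */
static int publish_foo(int a, int b, int c)
{
    /* This sketch has no reclamation, so it only supports publishing once. */
    assert(atomic_load_explicit(&gp, memory_order_relaxed) == NULL);

    struct foo *p = malloc(sizeof(*p));
    if (p == NULL)
        return -1;
    p->a = a;
    p->b = b;
    p->c = c;
    /* Release store: the three field writes are visible before the pointer is. */
    atomic_store_explicit(&gp, p, memory_order_release);
    return 0;
}

/* Reader: acquire load, so the field reads cannot be ordered before the fetch of p. */
static int read_sum(void)
{
    struct foo *p = atomic_load_explicit(&gp, memory_order_acquire);

    if (p == NULL)
        return -1;                 /* nothing published yet */
    return p->a + p->b + p->c;
}
```

A reader either sees NULL or a fully initialized structure, never a pointer to half-written fields, which is the guarantee the barrier pair provides.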

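OK, back to the main topic. Pointers now come wrapped with memory barriers, and because RCU is often used for list operations, the list operations also have RCU versions of their API:

list_add_rcu()
list_del_rcu()
list_replace_rcu()

These routines also take care of the memory barriers.

Putting the APIs in a table makes it clear: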
===============================================================================================
Category    Publish                 Retract                         Subscribe

Pointers    rcu_assign_pointer()    rcu_assign_pointer(..., NULL)   rcu_dereference()

Lists       list_add_rcu()          list_del_rcu()                  list_for_each_entry_rcu()
            list_add_tail_rcu()
            list_replace_rcu()
===============================================================================================
Next, a list example. This is the updater's code fragment:
Example 4:
1 struct foo {
2   struct list_head list;
3   int a;
4   int b;
5   int c;
6 };
7 LIST_HEAD(head);
8
9 /* . . . */
10
11 p = search(head, key);
12 if (p == NULL) {
13   /* Take appropriate action, unlock, and return. */
14 }
15 q = kmalloc(sizeof(*p), GFP_KERNEL);
16 *q = *p;
17 q->b = 2;
18 q->c = 3;
19 list_replace_rcu(&p->list, &q->list);
20 synchronize_rcu();
21 kfree(p);
Line 16 is the read-copy, and lines 17-19 are the update.

list_replace_rcu() looks like this:

static inline void list_replace_rcu(struct list_head *old,
                struct list_head *new)
{
    new->next = old->next;
    new->prev = old->prev;
    smp_wmb();
    new->next->prev = new;
    new->prev->next = new;
    old->prev = LIST_POISON2;
}
As you can see, it is a list replacement with a memory barrier embedded in it.

Now let's look at what the reader does:

Example 5:
rcu_read_lock();
list_for_each_rcu(p, list_head) {
    if (need_to_reference(p)) {
        reference_without_blocking(p);
        break;
    }
}
rcu_read_unlock();

The rcu_read_lock()/rcu_read_unlock() pair marks the reader critical section (CS). Inside the CS, referencing a list element is subject to the condition mentioned earlier: the reader must not block!!

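Summary: the RCU algorithm consists of roughly three parts:
1. A publish-subscribe mechanism for readers to reference data, as in Examples 3 and 5.
2. On the updater side: waiting for pre-existing RCU readers to finish, as with synchronize_rcu() in Example 4, and
3. Maintaining multiple versions to permit change w/o harming concurrent RCU readers, which rests on the two conditions described earlier!

Question 1: seqlock is also a synchronization mechanism. How does it differ from RCU?
Ans: With seqlock, readers are forced to retry their read whenever an updater updates the data; with RCU they are not.

Question 2: While a reader is executing list_for_each_entry_rcu(), what if an updater (say, on another CPU) is executing list_add_rcu()
(or list_del_rcu(), list_replace_rcu(), etc.) and changing the data at the same time? Isn't that a problem?
Ans: On Linux, loading/storing a pointer is an atomic operation, so the data that list_for_each_entry_rcu() references
is either the original version or the new version, one of the two, and never in an inconsistent state. Moreover,
list_for_each_entry_rcu() only moves forward and never goes back, so what it sees is either the new data or the old data.
The updater then waits until every reader that references the old data has left its read-side critical section, and only then frees or replaces that old data. All other readers read the new data, so there is no problem.

One thing here differs from the way we used to think: a traditional CS guarantees exclusive use of the protected resource.
An RCU reader that enters its CS, however, has no guarantee about whether it reads the new or the old data. A reader that fetched the pointer before the updater published the new version may still be reading the old data, while readers that start afterward read the new data. The updater's synchronize_rcu() is the point after which this ambiguity is gone: once it returns, no reader still holds the old data and every read sees the new data!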

Monday, April 20, 2009

Idiomatic C usages

This page records some idiomatic C usages seen in the kernel:

1. This usage looks a lot like C++!

static inline pte_t native_make_pte(pteval_t val)
{
      return (pte_t) { .pte = val };
}

p.s: The definition of pte_t depends on whether Physical Address Extension (PAE) is enabled. With PAE enabled it is a 64-bit structure; without it, a 32-bit structure:

#ifdef CONFIG_X86_PAE
typedef     u64    pteval_t;
typedef     union {
          struct {
              unsigned long pte_low, pte_high;
          };
          pteval_t pte;
} pte_t;
...
#else /* !CONFIG_X86_PAE */
typedef unsigned long    pteval_t;
typedef union {
    pteval_t pte;
    pteval_t pte_low;
} pte_t;
#endif
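
The construct in native_make_pte() is a C99 compound literal with a designated initializer. Here is a user-space sketch of the PAE-style variant, using hypothetical demo_ names and fixed-width types in place of the kernel typedefs:

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t demo_pteval_t;

typedef union {
    struct {
        uint32_t pte_low, pte_high;   /* C11 anonymous struct member */
    };
    demo_pteval_t pte;
} demo_pte_t;

/* Both views of the union must cover the same 64 bits. */
static_assert(sizeof(demo_pte_t) == sizeof(demo_pteval_t),
              "demo_pte_t must be exactly as wide as demo_pteval_t");

/* Build the union value in a single expression, with no temporary variable. */
static inline demo_pte_t demo_make_pte(demo_pteval_t val)
{
    return (demo_pte_t){ .pte = val };
}
```

The expression (demo_pte_t){ .pte = val } creates an unnamed object of the union type and names the member to initialize, which is why it reads a little like a C++ constructor call.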

2. The common per_cpu()

#define per_cpu(var, cpu)\
          (*SHIFT_PERCPU_PTR(&per_cpu_var(var), per_cpu_offset(cpu)))
where
#define SHIFT_PERCPU_PTR(__p, __offset)    RELOC_HIDE((__p), (__offset))

#define RELOC_HIDE(ptr, off)                    \
({ unsigned long __ptr;                    \
    __asm__ ("" : "=r"(__ptr) : "0"(ptr));        \
    (typeof(ptr)) (__ptr + (off)); })

The constraint "0" on `ptr' means that `ptr' uses the same location as operand 0,
i.e. `__ptr', whose constraints are `=' (output) and `r' (keep the variable in a register).
The inline asm therefore means: __ptr = (unsigned long) ptr; but hidden from the compiler's optimizer. The offset is then added
in bytes, as integer arithmetic on `unsigned long' rather than pointer arithmetic on `typeof(ptr)', and the result is cast back to `typeof(ptr)'.

Now let's see the other 2 defines:
#define per_cpu_var(var) per_cpu__##var

extern unsigned long __per_cpu_offset[NR_CPUS];
#define per_cpu_offset(x) (__per_cpu_offset[x])

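So per_cpu(var, cpu_id) expands to the object located at the address of per_cpu__##var /* note: double underscore */ plus __per_cpu_offset[cpu_id] bytes.

For example, sys_ioperm() contains this C statement:
     tss = &per_cpu(init_tss, get_cpu());
Expanding the macro gives, in effect,
     tss = (struct tss_struct *)((unsigned long)&per_cpu__init_tss + __per_cpu_offset[get_cpu()]);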
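
The byte-offset arithmetic is easy to check in user space. The sketch below leaves out the inline asm and keeps only the arithmetic; everything prefixed demo_ or DEMO_ is hypothetical, and an ordinary array stands in for the real per-CPU sections.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_NR_CPUS 4

struct demo_counter { long hits; long misses; };

/* One copy per "CPU", laid out back to back like per-CPU data areas. */
static struct demo_counter demo_area[DEMO_NR_CPUS];

/* Byte offset from copy 0 to the copy that belongs to cpu. */
static unsigned long demo_per_cpu_offset(int cpu)
{
    assert(cpu >= 0 && cpu < DEMO_NR_CPUS);
    return (unsigned long)cpu * sizeof(struct demo_counter);
}

/* Integer add in bytes, then cast back to the original pointer type. */
#define DEMO_SHIFT_PTR(ptr, off) \
    ((__typeof__(ptr))((uintptr_t)(ptr) + (off)))

/* Evaluates to the copy of the counter that belongs to cpu (an lvalue). */
#define demo_per_cpu(cpu) \
    (*DEMO_SHIFT_PTR(&demo_area[0], demo_per_cpu_offset(cpu)))
```

Because the offset is added to an integer, it is not scaled by the size of the pointed-to type, so the offset table has to hold byte distances.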

3. Ultra-concise struct initialization

processor.h has this usage:
#define INIT_TSS {\
    .x86_tss = {\
        .sp0        = sizeof(init_stack) + (long)&init_stack,\
        .ss0        = __KERNEL_DS,\
        .ss1        = __KERNEL_CS,\
        .io_bitmap_base    = INVALID_IO_BITMAP_OFFSET,\
    },\
    .io_bitmap        = { [0 ... IO_BITMAP_LONGS] = ~0 },\
}

INIT_TSS is later used to initialize a variable of type struct tss_struct, which is defined as follows:

struct tss_struct {
    /*
    * The hardware state:
    */
    struct x86_hw_tss    x86_tss;

    /*
    * The extra 1 is there because the CPU will access an
    * additional byte beyond the end of the IO permission
    * bitmap. The extra byte must be all 1 bits, and must
    * be within the limit.
    */
    unsigned long        io_bitmap[IO_BITMAP_LONGS + 1];
    /*
    * Cache the current maximum and the last task that used the bitmap:
    */
    unsigned long        io_bitmap_max;
    struct thread_struct    *io_bitmap_owner;

    /*
    * Pad the TSS to be cacheline-aligned (size is 0x100):
    */
    unsigned long        __cacheline_filler[35];
    /*
    * .. and then another 0x100 bytes for the emergency kernel stack:
    */
    unsigned long        stack[64];

} __attribute__((packed));

struct x86_hw_tss {
    unsigned short        back_link, __blh;
    unsigned long        sp0;
    unsigned short        ss0, __ss0h;
    unsigned long        sp1;
... (omitted)
} __attribute__((packed));

This usage is quite concise: even the sub-structure is initialized in one go!
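
A scaled-down version of the same initializer style, with hypothetical demo_ types and made-up values, might look like this. The [0 ... N] range is a GNU C extension (accepted by gcc and clang), not standard C.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_BITMAP_LONGS 4

struct demo_hw {
    unsigned long  sp0;
    unsigned short ss0;
    unsigned short ss1;
};

struct demo_tss {
    struct demo_hw hw;
    unsigned long  io_bitmap[DEMO_BITMAP_LONGS + 1];   /* one extra, as in tss_struct */
    unsigned long  io_bitmap_max;
};

/* Nested designated initializers plus a GNU range initializer. */
#define DEMO_INIT_TSS { \
    .hw = { .sp0 = 0x1000, .ss0 = 0x18, .ss1 = 0x10, }, \
    .io_bitmap = { [0 ... DEMO_BITMAP_LONGS] = ~0UL }, \
}

static struct demo_tss demo_init_tss = DEMO_INIT_TSS;

/* Return 1 if every bitmap word, including the extra one, is all ones. */
static int demo_bitmap_denies_all(const struct demo_tss *t)
{
    assert(t != NULL);
    for (size_t i = 0; i <= DEMO_BITMAP_LONGS; i++)
        if (t->io_bitmap[i] != ~0UL)
            return 0;
    return 1;
}
```

Members that are not named, such as io_bitmap_max here, are zero-initialized, so the macro only has to mention the fields that matter.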

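Sunday, February 15, 2009

Tracing libusb's lower layers (libusb thread support and the relation with kernel usbfs)

While working on a project I stumbled upon a masterly article by Greg KH, [Snooping the USB Data Stream]. One passage covers the kernel's support for usbfs, which lets an application issue usb transfers to a device directly through usbfs. It is implemented in three kernel sources: devio.c, inode.c, and devices.c. (note: my kernel version here is 2.6.26)

On the application library side, the [libusb project] builds on usbfs and has reached version 1.0.0, but the debian testing package still only ships libusb-0.1.12. For the library header files, install the libusb-dev package.

Many things changed from 0.1.12 to 1.0.0. Besides an almost completely redefined API, the most visible change is the added thread support, but at the lowest level both use the I/O controls provided by usbfs for their system calls. Because the design of 1.0.0 is rather complex, I traced 0.1.12 instead (honestly, my skills are not up to it!). Here are some notes from tracing 0.1.12:

in libusb-0.1.12: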

usb_bulk_read()          usb_interrupt_read()
usb_bulk_write()   and   usb_interrupt_write()
       :                      :
       V                      V
   USB_URB_TYPE_BULK      USB_URB_TYPE_INTERRUPT
                 :           :
                 :           :
                 V           V
               usb_urb_transfer()

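In outline, usb_urb_transfer() provides only a synchronous transfer (you call it and wait for it to complete). The 0.1.12 interface offers no asynchronous way (call and return immediately, with the system calling a completion function once the urb has been sent/received). Note that it is libusb-0.1.12 that lacks asynchronous functions; the kernel's IOCTL_USB_SUBMITURB always works asynchronously, as the kernel tracing below will show.
After usb_urb_transfer() submits the urb with IOCTL_USB_SUBMITURB, it repeatedly calls IOCTL_USB_REAPURBNDELAY to collect the completed urb, slicing the user-supplied timeout into 1 ms units and waiting on each with select(). There are a few possible outcomes:
1. select() sees I/O activity and REAPURB obtains a completed urb; the return value is the length of the data sent/received.
2. select() sees no I/O activity; the 1 ms select() wait is repeated until the timeout expires, and the return value is -ETIMEDOUT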
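
The "reap without blocking, sleep 1 ms, give up at the timeout" loop can be sketched independently of usbfs. This is not libusb's code: the callback type demo_reap_fn and the functions demo_wait_transfer() and demo_reap_after() are hypothetical, and select() is used here purely as a 1 ms timer with no file descriptors.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>

/* Hypothetical callback: returns >= 0 (bytes transferred) once the request
 * has completed, or -EAGAIN while it is still pending. */
typedef int (*demo_reap_fn)(void *ctx);

/* Poll for completion in 1 ms slices, giving up after about timeout_ms. */
static int demo_wait_transfer(demo_reap_fn reap, void *ctx, int timeout_ms)
{
    assert(reap != NULL);
    for (int waited = 0; ; waited++) {
        int ret = reap(ctx);

        if (ret != -EAGAIN)
            return ret;                  /* completed, or failed for another reason */
        if (waited >= timeout_ms)
            return -ETIMEDOUT;

        struct timeval tv = { .tv_sec = 0, .tv_usec = 1000 };   /* 1 ms */
        select(0, NULL, NULL, NULL, &tv);   /* used only as a sub-second sleep */
    }
}

/* Demo callback: stays pending for the first *ctx calls, then reports 64 bytes. */
static int demo_reap_after(void *ctx)
{
    int *remaining = ctx;

    if (*remaining > 0) {
        (*remaining)--;
        return -EAGAIN;
    }
    return 64;
}
```

With a callback that completes after three polls, the function returns 64; with one that never completes within the timeout, it returns -ETIMEDOUT, matching the two outcomes listed above.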

usb_urb_transfer() has a comment, quoted here verbatim:

#define URB_USERCONTEXT_COOKIE        ((void *)0x1)

/*
   * HACK: The use of urb.usercontext is a hack to get threaded applications
   * sort of working again. Threaded support is still not recommended, but
   * this should allow applications to work in the common cases. Basically,
   * if we get the completion for an URB we're not waiting for, then we update
   * the usercontext pointer to 1 for the other threads URB and it will see
   * the change after it wakes up from the the timeout. Ugly, but it works.
   */

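This says threaded applications can work! According to the comment, urb.usercontext is used to "mark" a reaped urb: if it is not our urb, its urb.usercontext is set to URB_USERCONTEXT_COOKIE so that the other thread can pick it up. But can the other thread actually reap the urb that was marked with the cookie?

At this point we have no choice but to trace into the kernel's devio.c; only a thorough understanding of how the kernel implements usbfs can answer this question...

Start looking for clues at IOCTL_USB_SUBMITURB. This is the I/O control code defined by libusb; its counterpart in the kernel is the USBDEVFS_SUBMITURB I/O control code, which is handled by proc_submiturb(ps, p); where ps and p are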

   struct dev_state *ps = file->private_data;
   void __user *p = (void __user *)arg;

arg is the argument pointer passed in by the user through the ioctl system call; here it carries the user urb.
ps is allocated in usbdev_open() and is defined as:

struct dev_state {
    struct list_head list;      /* state list */
    struct usb_device *dev;
    struct file *file;
    spinlock_t lock;            /* protects the async urb lists */
    struct list_head async_pending;
    struct list_head async_completed;
    wait_queue_head_t wait;     /* wake up if a request completed */
    unsigned int discsignr;
    struct pid *disc_pid;
    uid_t disc_uid, disc_euid;
    void __user *disccontext;
    unsigned long ifclaimed;
    u32 secid;
};

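This data structure can be viewed as the process's record of the state of the urbs it sends to this device. Of the two lists, async_pending records the async corresponding to each urb once it has been submitted, and async_completed records it once the urb completes. What is an async? Every urb has a corresponding async data structure, which is allocated in proc_do_submiturb(). The async data structure is: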

struct async {
    struct list_head asynclist;
    struct dev_state *ps;
    struct pid *pid;
    uid_t uid, euid;
    unsigned int signr;
    unsigned int ifnum;
    void __user *userbuffer;
    void __user *userurb;
    struct urb *urb;
    int status;
    u32 secid;
};

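Every urb has an async record, allocated in proc_do_submiturb(). It also records the location of the user urb, so that the data obtained by the urb can later be copied back to the user urb.

The whole urb flow is:
proc_submiturb(): copies the user's urb (user space) into our uurb (kernel space)
->proc_do_submiturb(): initializes some fields according to the type (bulk, interrupt, ...), allocates the async data structure (which also contains the urb), puts it on the ps->async_pending queue as a record, and then calls usb_submit_urb() (from there the usb host controller takes over). It returns without waiting for the result, which is why we say the kernel handles urbs asynchronously here!

When the usb host controller has finished processing the urb, async_complete() is called, and it moves the async record from ps->async_pending to ps->async_completed.

On the other side, the user "reaps" completed urbs through IOCTL_USB_REAPURBNDELAY, which corresponds to USBDEVFS_REAPURBNDELAY in the kernel. This I/O control calls proc_reapurbnonblock(), which checks whether ps->async_completed holds an async record. If not, it returns -EAGAIN; if so, it copies the data of the urb it found (kernel space) to user space and sets the arg passed to IOCTL_USB_REAPURBNDELAY to point to the user space urb. That completes the kernel side of the transfer.

Good, the kernel part is now roughly understood. Back to our question, which we discuss in two cases:
1. Every thread uses the same open handle, that is, the same ps in the kernel:
On IOCTL_USB_REAPURBNDELAY the kernel takes a completed urb off ps->async_completed, but it does not know which thread the urb belongs to, so libusb marks it with the cookie. In usb_urb_transfer() each thread declares a user space urb on its own thread stack. If the first thread reaps an urb that is not its own, it stamps the cookie on it and goes on reaping other urbs from the kernel. When the second thread then reaps, it just takes another one from ps->async_completed... and of course the second thread never receives the urb it should have, so an error occurs.

2. Every thread opens its own handle, that is, each has its own ps in the kernel:
Each thread then has its own ps->async_completed list, so the urbs it collects cannot get mixed up, and the cookie does not seem to be needed at all! But there is a drawback: an interface must be claimed before a bulk transfer, so every thread must claim the interface before using usb_urb_transfer()... The thread that already claimed the interface has to release it, which means its outstanding urbs must be finished first, and that defeats the whole purpose of having multiple threads "interleave urbs transfer"!

So the conclusion is: to use libusb-0.1.12 with threads you will probably have to modify usb_urb_transfer() yourself; otherwise just use libusb-1.0.0.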

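Tuesday, January 20, 2009

Using cscope instead of Source Insight to browse kernel source

Solving the code-browsing problem on Linux

On Windows we have Source Insight for conveniently browsing the code of a large project. When moving to development on Linux, we can build a similar environment. The following describes how to set one up (we assume the reader is already familiar with installing and configuring emacs).
Step 1: install the following software

1) cscope: cscope is a code-browsing tool. In a large project it helps you
quickly locate where a function/variable is declared, everywhere it is referenced, and so on. It can be used together with vim and emacs.
Used on its own, cscope makes jumping between files hard to manage, so here we describe using cscope inside emacs. It needs an index built in advance:
the index files cscope.[in][out] are built from the contents of cscope.files.

Step 2: modify or create the .emacs file
;; load the plugin we need (required for using cscope)
(load-file "/usr/share/emacs/site-lisp/xcscope.el")
(require 'xcscope)
(setq cscope-do-not-update-database t) ;; this line is explained later
(setq cscope-initial-directory "./") ;; look for cscope.out in the current directory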

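Step 3: add a project:
Suppose we want to build a cscope index for the source code in /home/tom/src/linux-2.6.23. We can do it like this:

#>make ARCH=arm cscope
Here ARCH can be arm, x86, mips, ... or whatever CPU arch you want.

Or do it by hand:

1) #>cd /home/tom/src/linux-2.6.23 to enter the source root directory;
2) #>touch cscope.sh to create a script file with the following contents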

#!/bin/bash
LNX=`pwd`
ARCH=arm
cd /
find $LNX/ \
-path "$LNX/arch/*" ! -path "$LNX/arch/$ARCH*" -prune -o \
-path "$LNX/include/asm-*" ! -path "$LNX/include/asm-$ARCH*" -prune -o \
-path "$LNX/tmp*" -prune -o \
-path "$LNX/Documentation*" -prune -o \
-path "$LNX/scripts*" -prune -o \
-path "$LNX/drivers*" -prune -o \
-name "*.[chxsS]" -print >$LNX/cscope.files
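find $LNX/ -path "$LNX/include/asm-generic*" -name "*.[chxsS]" -print >> $LNX/cscope.files

3) Run the cscope.sh script to generate cscope.files.

Then run #>cscope -b -k -q ( -q builds an inverted index, which speeds up searching!). Building the index takes a few minutes! Because the file list was made for ARCH=arm, the resulting index is arm-centric.

Step 4: cscope code-browsing commands
C-c s a sets the initial directory (cscope-set-initial-directory), usually the root directory of your code. To save effort, the initial directory can be set as soon as emacs starts, as in the .emacs example above.
C-c s I (capital i) builds a list of the relevant files in the directory and indexes them. By default the index is also rebuilt automatically. Remember that we already built the index above with the -q option, so not only should this command not be used here (building the index with -q outside emacs is enough), the automatic index rebuild should also be disabled, because emacs would otherwise invoke cscope on its own to rebuild the index. That is why we set cscope-do-not-update-database earlier.

Without this change, the automatic rebuild warns that -q does not match the database, and the rebuilt database loses the inverted index, which makes it rather slow to use, because there are just too many kernel symbols!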

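            C-c s s find a symbol
            C-c s g find a global definition (i.e. cscope-find-global-definition)
            C-c s c find the functions that call a given function
            C-c s C find the functions called by a given function
            C-c s e find a regular expression
            C-c s f find a file
            C-c s i find the files that #include a given file
            C-c s u go back to the previous symbol (i.e. cscope-pop-mark)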

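Conclusion: Source Insight is intuitive and easy to use, but you still need a Windows environment to run it; emulating Windows with wine is one way! This post just offers a "pure Linux" approach for everyone's reference!

Addendum: using etags alongside is even more convenient: (assuming the ctags package is installed)

       1)   Before starting emacs, build the TAGS file with etags -R.
Commands inside emacs:
       2) M-x visit-tags-table asks whether to use the default TAGS file; press y.
       3) M-. finds the definition of the identifier at the cursor.
       4) M-* goes back.
       5)   C-u M-. finds the next definition of the tag.

Addendum: modify .emacs to bind keys for the common cscope commands. Add this after (require 'xcscope)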
(define-key global-map [f5] 'cscope-find-this-file)
(define-key global-map [f6] 'cscope-find-this-symbol)
(define-key global-map [f7] 'cscope-pop-mark)
(define-key global-map [f8] 'cscope-find-global-definition)
(define-key global-map [f9] 'cscope-find-global-definition-no-prompting)
(define-key global-map [M-up] 'cscope-prev-symbol)
(define-key global-map [M-down] 'cscope-next-symbol)
(define-key global-map [f12] 'c-down-conditional-with-else)
(define-key global-map [M-f12] 'c-up-conditional-with-else)

Now you can browse code with F5, F6, F7, F8, F9, Esc-↑, and Esc-↓, and you can bind other frequently used keys the same way.
