Version 2 (modified by Youren Shen, on Aug 17, 2014 at 4:53:21 PM)

GSOC 2014 - Paravirtualization of RTEMS

This year's project builds on the results of last year's. We share the same goal: to introduce a virtualization layer into RTEMS. The difference is that this year we focus more on the hypervisor. We designed two mechanisms to connect the guest OS and the host OS. The first is the hypercall, which sends requests from the guest OS to the host OS. The second is the notification mechanism, which sends requests from the host OS to the guest OS. With these two mechanisms implemented in POK, RTEMS can cooperate with POK.

See also the wiki page for last year's project and this year's proposal.

The source code can be found in my GitHub repository.

Partitioned OS kernel – POK

POK is a partitioned OS kernel compliant with ARINC 653. Our goal is to adapt the POK kernel into a hypervisor suitable for RTEMS paravirtualization. Adapting the kernel first requires understanding how POK works.

= The POK startup flow =

The general x86 boot sequence is common knowledge. What we still need to look at is how POK sets up the GDT, because that determines the privilege levels and the segment settings. It is also important to understand POK's interrupt handling mechanism before dealing with interrupt delivery.

{{{
pok_ret_t pok_arch_init ()
{
  pok_gdt_init ();
  pok_event_init ();
  return (POK_ERRNO_OK);
}
}}}
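To make the privilege level and segment settings more concrete, here is a minimal sketch of how an x86 segment descriptor and a segment selector are packed. It follows the standard x86 protected-mode layout only; the `sketch_*` names are hypothetical and are not taken from POK's `pok_gdt_init`.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not POK code): pack base, 20-bit limit, access byte
 * and flags nibble into one 8-byte x86 segment descriptor. */
uint64_t sketch_gdt_entry(uint32_t base, uint32_t limit,
                          uint8_t access, uint8_t flags)
{
    uint64_t d = 0;

    assert(limit <= 0xFFFFFu);                     /* limit is 20 bits wide  */
    d |= (uint64_t)(limit & 0xFFFFu);              /* limit 15:0  -> 0..15   */
    d |= (uint64_t)(base & 0xFFFFFFu) << 16;       /* base 23:0   -> 16..39  */
    d |= (uint64_t)access << 40;                   /* access byte -> 40..47  */
    d |= (uint64_t)((limit >> 16) & 0xFu) << 48;   /* limit 19:16 -> 48..51  */
    d |= (uint64_t)(flags & 0xFu) << 52;           /* flags       -> 52..55  */
    d |= (uint64_t)((base >> 24) & 0xFFu) << 56;   /* base 31:24  -> 56..63  */
    return d;
}

/* Descriptor privilege level: bits 5..6 of the access byte. */
unsigned sketch_dpl(uint64_t descriptor)
{
    return (unsigned)((descriptor >> 45) & 0x3u);
}

/* Selector for a GDT entry: index in bits 3..15, requested privilege
 * level in bits 0..1 (table indicator bit 2 left at 0 for the GDT). */
uint16_t sketch_selector(uint16_t index, uint8_t rpl)
{
    assert(index < 8192u);
    return (uint16_t)((index << 3) | (rpl & 0x3u));
}
```

With this layout, a flat ring-0 code segment (access byte 0x9A) and a flat ring-3 code segment (access byte 0xFA) differ only in the DPL bits, and an expression such as `index << 3` yields the ring-0 selector for a GDT index.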

POK initializes the GDT and IDT in the two functions called from pok_arch_init. For more details, please see these two blog posts: POK Startup Flow and The syscall system in POK.

= The POK context switch function =

This function is interesting because it differs from what other operating systems do. It uses the structure context_t to emulate the behavior of an interrupt and of an interrupt return.

For more details, please see this blog post.
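The idea of emulating an interrupt return can be sketched as follows: a new thread's stack is prepared with a fake frame that looks as if the thread had just been interrupted, so that restoring the registers and executing `iret` starts it. The field layout below is an assumption based on the order in which the x86 `popa` and `iret` instructions read the stack; POK's real `context_t` may differ, and all `sketch_*` names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed layout (lowest address first), not POK's actual context_t. */
typedef struct {
    uint32_t edi, esi, ebp, esp_unused, ebx, edx, ecx, eax; /* read by popa */
    uint32_t eip, cs, eflags;                               /* read by iret */
} sketch_context_t;

#define SKETCH_EFLAGS_RESERVED 0x002u /* bit 1 is always set on x86 */
#define SKETCH_EFLAGS_IF       0x200u /* interrupts enabled         */

/* Build a fake interrupt frame at the top of a fresh stack and return its
 * address, which would become the saved stack pointer of the new thread. */
sketch_context_t *sketch_context_create(uint32_t *stack, size_t words,
                                        uint32_t entry, uint16_t code_selector)
{
    const size_t frame_words = sizeof(sketch_context_t) / sizeof(uint32_t);
    sketch_context_t *ctx;

    assert(stack != NULL && words >= frame_words);
    ctx = (sketch_context_t *)(stack + words - frame_words);
    memset(ctx, 0, sizeof *ctx);
    ctx->eip    = entry;           /* where "iret" will resume execution */
    ctx->cs     = code_selector;   /* code segment to resume in          */
    ctx->eflags = SKETCH_EFLAGS_RESERVED | SKETCH_EFLAGS_IF;
    return ctx;
}
```

Switching to such a thread then amounts to loading the returned address into the stack pointer, popping the general registers and executing `iret`, which is exactly the path taken when leaving a real interrupt handler.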

The separation between POK kernel and virtualization

Splitting the POK BSP into x86-qemu and x86-qemu-vmm benefits the paravirtualization work: changes to x86-qemu-vmm do not affect the normal x86-qemu, and the changes made for x86-qemu-vmm are easier to see. Where to change: there are two situations. First, in some files only a few lines are specific to virtualization. In this situation, we use the macro POK_NEEDS_X86_VMM to control compilation. Second, if a whole file is for virtualization only, we select it in the Makefile. Here are two examples:

= Separate in one source file =

In arch.c, the function pok_arch_event_register is split into two versions: one for x86-qemu and the other for x86-qemu-vmm.
{{{
diff --git a/kernel/arch/x86/arch.c b/kernel/arch/x86/arch.c
index 917f3f3..1d54dda 100644
--- a/kernel/arch/x86/arch.c
+++ b/kernel/arch/x86/arch.c
@@ -58,6 +58,7 @@ pok_ret_t pok_arch_idle()
 }
 
 
+#ifdef POK_NEEDS_X86_VMM
 extern void pok_irq_prologue_0(void);
 extern void pok_irq_prologue_1(void);
 extern void pok_irq_prologue_2(void);
@@ -119,6 +120,20 @@ pok_ret_t pok_arch_event_register  (uint8_t vector,
   }
 }
 
+#else
+pok_ret_t pok_arch_event_register  (uint8_t vector,
+                                    void (*handler)(void))
+{
+  pok_idt_set_gate (vector,
+                   GDT_CORE_CODE_SEGMENT << 3,
+               (uint32_t) handler,
+                   IDTE_TRAP,
+                   3);
+
+  return (POK_ERRNO_OK);
+}
+#endif /* POK_NEEDS_X86_VMM */
}}}
=  Separate in Makefile  =

If one whole file is for virtualization only, we can change the Makefile to separate it. Here is an example:
{{{
diff --git a/kernel/arch/x86/Makefile b/kernel/arch/x86/Makefile
index c486d47..f9cba01 100644
--- a/kernel/arch/x86/Makefile
+++ b/kernel/arch/x86/Makefile
@@ -13,10 +13,14 @@ LO_OBJS=   arch.o      \
            space.o     \
            syscalls.o  \
            interrupt.o \
-          interrupt_prologue.o    \
            pci.o       \
            exceptions.o
 
+ifeq ($(BSP),x86-qemu-vmm)
+
+LO_OBJS+= interrupt_prologue.o
+
+endif
 LO_DEPS=   $(BSP)/$(BSP).lo
 
 all: $(LO_TARGET)
}}}
I hope this makes the separation clear enough. It is used in the next steps.
=  Hypercall  =

The hypercall is a mechanism modeled on the syscall. It is a way for the guest to use the hypervisor's resources or to notify the hypervisor of events.
Here are the changes made in the POK kernel:
 
# Add a call to pok_hypercall_init in pok_event_init, guarded by POK_NEEDS_X86_VMM.
# Add the two header files, kernel/include/core/hypercall.h and libpok/include/core/hypercall.h, and define the corresponding structures and declarations.
# Implement the corresponding functions in the corresponding .c files. That is:
# kernel/arch/x86/hypercall.c, which builds the hypercall_gate.
# kernel/core/hypercall.c, in which pok_core_hypercall handles the hypercall.
# Modify kernel/include/arch/x86/interrupt.h to add support for the hypercall handler.
# Add libpok/arch/x86/hypercall.c, which implements pok_do_hypercall; this function raises the software interrupt.
# Modify the related Makefiles so that these files are built when the BSP is x86-qemu-vmm and do not affect the normal POK when the BSP is not x86-qemu-vmm.
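
The division of work in the list above can be sketched in plain C. This is only a user-space model under stated assumptions: the hypercall IDs, the argument structure and every `sketch_*` name are hypothetical, and the software interrupt plus the hypercall gate are replaced by a direct function call so that the example can run anywhere.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical IDs and types; POK's real ones live in hypercall.h. */
typedef enum {
    SKETCH_HYPERCALL_SET_TICK = 1,
    SKETCH_HYPERCALL_GET_TICK = 2
} sketch_hypercall_id_t;

typedef struct {
    uint32_t arg1;
    uint32_t arg2;
} sketch_hypercall_args_t;

#define SKETCH_ERRNO_OK     0
#define SKETCH_ERRNO_EINVAL (-1)

static uint32_t sketch_tick; /* stand-in for some hypervisor-owned state */

/* Kernel side: plays the role described for pok_core_hypercall,
 * dispatching on the hypercall ID. */
int32_t sketch_core_hypercall(sketch_hypercall_id_t id,
                              const sketch_hypercall_args_t *args)
{
    assert(args != NULL);
    switch (id) {
    case SKETCH_HYPERCALL_SET_TICK:
        sketch_tick = args->arg1;
        return SKETCH_ERRNO_OK;
    case SKETCH_HYPERCALL_GET_TICK:
        return (int32_t)sketch_tick;
    default:
        return SKETCH_ERRNO_EINVAL;   /* unknown hypercall number */
    }
}

/* Guest side: plays the role described for pok_do_hypercall. A real guest
 * would raise a software interrupt here and the hypercall gate would enter
 * the kernel; this model calls the dispatcher directly instead. */
int32_t sketch_do_hypercall(sketch_hypercall_id_t id,
                            uint32_t arg1, uint32_t arg2)
{
    sketch_hypercall_args_t args = { arg1, arg2 };
    return sketch_core_hypercall(id, &args);
}
```

Keeping the guest-side wrapper and the kernel-side dispatcher in separate files, as the list does, means only the wrapper has to change if the trap mechanism changes.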

For more details please see this [http://huaiyusched.github.io/2014/05/30/build-a-new-hypercall-system-by-imitating-the-syscall/ blog].
=  vCPU in partition  =


The vcpu is part of scheduling in the VMM and is used to manage the processor, while the arch-dependent structure (arch_vcpu) is tied to the current partition.
As a result, the vcpu structure as a whole belongs to processor management and should be placed in the kernel.

I created a new file vcpu.h in kernel/include and put the vcpu structure definition in it. Then I created arch_vcpu.h in kernel/arch/x86 and put arch_vcpu in it. In this structure, I use context_t to hold user_regs.
arch_vcpu also contains an irq_desc struct to store interrupt information.
Then I created a new file vcpu.c in kernel/core and implemented the alloc_vcpu function in it. This function relies on some arch-dependent functions, such as alloc_vcpu_struct and vcpu_initialize, so I created a new file arch_vcpu.c in pok/kernel/arch/x86 and put the arch-dependent functions there.

I also modified some existing files, such as pok/kernel/include/core/partition.h, where I plan to add a vcpu list head to each partition. Another modified file is kernel/core/sched.c, where I added some empty functions, because scheduling for the vcpu is not necessary yet.

Finally, I call alloc_vcpu in partition_init.
All of these functions will be tested this week.

A few things should be noted:
#The space allocated by alloc_vcpu_struct cannot be freed, so once a vcpu has been allocated it cannot be destroyed. As a result, vcpus cannot be dynamic. Perhaps in the future we can allocate them through the AADL file.
#In the function vcpu_initialize, we planned to set up the scheduling function, but scheduling for the vcpu is not essential yet, so the function is empty for now.
#The function declarations in the header files are omitted here.
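
The allocate-only behavior noted above can be illustrated with a small model. This is a sketch under assumptions, not the code in vcpu.c or arch_vcpu.c: the structure fields, the pool size and all `sketch_*` names are hypothetical, and the register area is a plain array standing in for context_t.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_MAX_VCPUS 4u   /* hypothetical fixed pool size */

struct sketch_arch_vcpu {
    uint32_t user_regs[11];   /* stand-in for the reused context_t */
};

struct sketch_vcpu {
    uint32_t                vcpu_id;
    uint32_t                partition_id;
    struct sketch_arch_vcpu arch;
    struct sketch_vcpu     *next;   /* link in the partition's vcpu list */
};

static struct sketch_vcpu sketch_vcpu_pool[SKETCH_MAX_VCPUS];
static size_t             sketch_vcpu_used;

/* Arch-dependent part: hand out the next pool slot. There is deliberately
 * no matching free function, mirroring the first note above. */
struct sketch_vcpu *sketch_alloc_vcpu_struct(void)
{
    if (sketch_vcpu_used >= SKETCH_MAX_VCPUS)
        return NULL;
    return &sketch_vcpu_pool[sketch_vcpu_used++];
}

/* Generic part: allocate a vcpu, initialize it and push it onto the
 * vcpu list of the owning partition. */
struct sketch_vcpu *sketch_alloc_vcpu(uint32_t partition_id,
                                      struct sketch_vcpu **list_head)
{
    struct sketch_vcpu *v;

    assert(list_head != NULL);
    v = sketch_alloc_vcpu_struct();
    if (v == NULL)
        return NULL;               /* pool exhausted */
    v->vcpu_id      = (uint32_t)(v - sketch_vcpu_pool);
    v->partition_id = partition_id;
    memset(&v->arch, 0, sizeof v->arch);
    v->next    = *list_head;
    *list_head = v;
    return v;
}
```

Because slots are never returned to the pool, the number of vcpus is bounded at build time, which is why a static description of vcpus per partition would fit this design.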
New files
#pok/kernel/core/vcpu.c
#pok/kernel/arch/x86/arch_vcpu.c
#pok/kernel/include/core/vcpu.h
#pok/kernel/include/arch/x86/arch_vcpu.h
Modified files
#pok/kernel/core/sched.c
#pok/kernel/include/partition.h
Reused structures
#The context_t is reused in arch_vcpu to hold the user_regs.
#The interrupt_frame is reused in arch_vcpu to hold the interrupt information.

For more details please see this [http://huaiyusched.github.io/2014/06/10/the-design-of-vcpu-in-pok/ blog].
=  Register interrupt handler for vCPU  =


The guest OS must first register its interrupt handlers. In the paravirtualization layer, all native interrupt registration functions in RTEMS should be replaced with this register function for the vCPU.

This function is implemented through a hypercall. We add a new hypercall and implement the core function.
=  Add a new hypercall  =

New hypercall numbers:
{{{
   POK_HYPERCALL_IRQ_REGISTER_VCPU         =  30,
   POK_HYPERCALL_IRQ_UNREGISTER_VCPU       =  31,
}}}
New cases in pok_core_hypercall:
{{{
pok_ret_t pok_core_hypercall (const pok_hypercall_id_t       hypercall_id,
                            const pok_hypercall_args_t*    args,
                            const pok_hypercall_info_t*    infos)
{
....
   /* register interrupt delivery to vcpu */
   case POK_HYPERCALL_IRQ_REGISTER_VCPU:
       return pok_bsp_irq_register_vcpu(args->arg1,(void(*)(uint8_t)) ((args->arg2 + infos->base_addr)));
       break;
   /* unregister interrupt delivery to vcpu */
   case POK_HYPERCALL_IRQ_UNREGISTER_VCPU:
       return pok_bsp_irq_unregister_vcpu(args->arg1);
       break;
....
}
}}}

For more details please see this [http://huaiyusched.github.io/2014/07/29/the-interrupt-register-function-for-vcpu blog].
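
What the two new cases do can be modeled with a small handler table. This is a sketch under assumptions: it supposes that the handler address passed by the guest is relative to the partition and is made absolute by adding the partition base address, as the `args->arg2 + infos->base_addr` expression suggests. The table type, the IRQ count and all `sketch_*` names are hypothetical, and addresses are kept as 32-bit integers instead of function pointers so the model runs on any host.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_IRQ_COUNT 16u
#define SKETCH_OK        0
#define SKETCH_EINVAL    (-1)

/* One slot per IRQ line; 0 means that no handler is registered. */
typedef struct {
    uint32_t handler[SKETCH_IRQ_COUNT];
} sketch_vcpu_irq_table_t;

/* Record the guest handler for an IRQ, translating the partition-relative
 * address into an absolute one by adding the partition base address. */
int sketch_irq_register_vcpu(sketch_vcpu_irq_table_t *table, uint8_t irq,
                             uint32_t guest_offset, uint32_t base_addr)
{
    assert(table != NULL);
    if (irq >= SKETCH_IRQ_COUNT || table->handler[irq] != 0u)
        return SKETCH_EINVAL;      /* bad line or already registered */
    table->handler[irq] = base_addr + guest_offset;
    return SKETCH_OK;
}

/* Remove the guest handler for an IRQ, if one is registered. */
int sketch_irq_unregister_vcpu(sketch_vcpu_irq_table_t *table, uint8_t irq)
{
    assert(table != NULL);
    if (irq >= SKETCH_IRQ_COUNT || table->handler[irq] == 0u)
        return SKETCH_EINVAL;      /* bad line or nothing registered */
    table->handler[irq] = 0u;
    return SKETCH_OK;
}
```

When the hypervisor later receives the physical interrupt, it would look up the slot for that line and deliver the interrupt to the stored guest handler.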