The Serverless controller does not serve time-critical user requests. It reconciles Function custom resources (CRs) stored in the Kubernetes API server and has no persistent state of its own.
The Serverless controller doesn't build or serve Functions using its own allocated runtime resources. It delegates this work to dedicated Kubernetes workloads: it schedules (build-time) jobs to build the Function Docker image and (runtime) Pods to serve the Functions once they are built. Refer to the architecture diagram for more details.
With this in mind, the Serverless controller does not require horizontal scaling. It scales vertically up to 160Mi of memory and 500m of CPU time.
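Expressed as Kubernetes resource limits, the controller's vertical ceiling could look like the fragment below. This is an illustrative sketch only: the Deployment name, namespace, and request values are assumptions and may differ in your Kyma installation.

```yaml
# Sketch of the controller's resource section - names and request
# values are assumptions; only the limits come from the text above.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: serverless-controller      # assumed name, check your cluster
  namespace: kyma-system
spec:
  template:
    spec:
      containers:
        - name: manager
          resources:
            requests:
              cpu: 10m             # illustrative request values
              memory: 32Mi
            limits:
              cpu: 500m            # vertical scaling ceiling
              memory: 160Mi
```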
Limitation for the number of Functions
There is no upper limit on the number of Functions that can run on Kyma (similar to Kubernetes workloads in general). Once a user defines a Function, its build jobs and runtime Pods are always requested by the Serverless controller. It's up to Kubernetes to schedule them based on the available memory and CPU time on the Kubernetes worker nodes. This is determined mainly by the number of worker nodes (and the node auto-scaling capabilities) and their computational capacity.
Build phase limitations
The time necessary to build a Function depends on:
- the selected build profile, which determines the requested resources (and their limits) for the build phase
- the number and size of dependencies that must be downloaded and bundled into the Function image
- the cluster nodes' specification (see the note with the reference specification at the end of this article)
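The build profile is selected on the Function CR itself. The following is a minimal sketch under stated assumptions: the `v1alpha2` API version and the `nodejs18` runtime reflect recent Kyma releases, and the profile name `fast` is hypothetical, so check the profiles available in your Kyma version.

```yaml
# Sketch of a Function requesting a larger build profile to shorten
# the build phase. Runtime version and profile name are assumptions.
apiVersion: serverless.kyma-project.io/v1alpha2
kind: Function
metadata:
  name: my-function
spec:
  runtime: nodejs18                # assumed runtime version
  resourceConfiguration:
    build:
      profile: fast                # hypothetical profile name
  source:
    inline:
      source: |
        module.exports = {
          main: function (event, context) {
            return "Hello World";
          }
        }
```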
The shortest build time (the limit) is approximately 15 seconds. It requires no limits on the build job resources and a minimal number of dependencies pulled in during the build phase.
Running multiple Function build jobs at once (especially with no limits) may drain the cluster resources. To mitigate this risk, there is an additional limit of five simultaneous Function builds. If a sixth build is scheduled, it starts once there is a vacancy in the build queue.
This limitation is configurable using
Runtime phase limitations
In the runtime phase, Functions serve user-provided logic wrapped in a web framework (express for Node.js and bottle for Python). The user logic aside, these frameworks have their own limitations, which depend on the selected runtime profile and the Kubernetes nodes' specification (see the note with the reference specification at the end of this article).
The following describes the response times of the selected runtime profiles for a "hello world" Function requested at 50 requests/second. This captures the overhead of the serving framework itself. Any user logic added on top adds extra milliseconds and must be profiled separately.
Naturally, the bigger the runtime profile, the more resources are available to serve responses quickly. Treat these limits of the serving layer as a baseline, as they do not take your Function logic into account.
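Analogously to the build profile, the runtime profile is also selected on the Function CR. The fragment below is a sketch; the profile name `M` is a hypothetical example, so check the profiles available in your Kyma version.

```yaml
# Fragment of a Function spec selecting a runtime profile.
# The profile name is an assumed example.
spec:
  resourceConfiguration:
    function:
      profile: M
```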
Function runtime Pods can be scaled horizontally from zero up to the limits of the available resources at the Kubernetes worker nodes. See the Use external scalers tutorial for more information.
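Scaling to zero is typically achieved with an external scaler such as KEDA rather than a plain HorizontalPodAutoscaler, which cannot go below one replica. The following is a minimal sketch assuming KEDA is installed and the Function's runtime Deployment is named `hello-function`; in practice, Serverless generates the Deployment name, and the Prometheus trigger here is just one example of a KEDA scaler.

```yaml
# Sketch of a KEDA ScaledObject scaling a Function's runtime Pods
# from zero. Target name, query, and threshold are assumptions.
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: hello-function-scaler
spec:
  scaleTargetRef:
    name: hello-function           # assumed generated Deployment name
  minReplicaCount: 0               # scale to zero when idle
  maxReplicaCount: 5
  triggers:
    - type: prometheus             # example trigger; any KEDA scaler works
      metadata:
        serverAddress: http://prometheus.monitoring:9090
        query: sum(rate(http_requests_total{app="hello-function"}[1m]))
        threshold: "10"
```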
In-cluster Docker registry limitations
Serverless comes with an in-cluster Docker registry for the Function images. This registry is suitable only for development because of the following limitations:
- Registry capacity is limited to 20 GB.
- There is no image lifecycle management. Once an image is stored in the registry, it stays there until it is manually removed.
NOTE: All measurements were done on Kubernetes with five AWS worker nodes of type m5.xlarge (four 3.1 GHz x86_64 CPU cores, 16 GiB memory).