Kubernetes monitoring: what is process-exporter?

이쿠우우 2024. 9. 22. 21:02

Process Exporter

 

What is process-exporter?

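Collects process-related metrics, a subset of system metrics.
Can be integrated with Prometheus.
It is not maintained by the Prometheus project.
Released under the MIT license.
It is not widely used, and very little material about it is available.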
 

Metrics layer collected

=  System Metrics
Collects system metrics for the processes running on an instance and exposes them through an API.
 

Where to find the list of collected metrics

[process-exporter metrics list]
No separate reference is available.

Compiled by hand, the list is as follows.
# HELP namedprocess_namegroup_context_switches_total Context switches
# HELP namedprocess_namegroup_cpu_seconds_total Cpu user usage in seconds
# HELP namedprocess_namegroup_major_page_faults_total Major page faults
# HELP namedprocess_namegroup_memory_bytes number of bytes of memory in use
# HELP namedprocess_namegroup_minor_page_faults_total Minor page faults
# HELP namedprocess_namegroup_num_procs number of processes in this group
# HELP namedprocess_namegroup_num_threads Number of threads
# HELP namedprocess_namegroup_oldest_start_time_seconds start time in seconds since 1970/01/01 of oldest process in group
# HELP namedprocess_namegroup_open_filedesc number of open file descriptors for this group
# HELP namedprocess_namegroup_read_bytes_total number of bytes read by this group
# HELP namedprocess_namegroup_states Number of processes in states Running, Sleeping, Waiting, Zombie, or Other
# HELP namedprocess_namegroup_thread_context_switches_total Context switches for these threads
# HELP namedprocess_namegroup_thread_count Number of threads in this group with same threadname
# HELP namedprocess_namegroup_thread_cpu_seconds_total Cpu user/system usage in seconds
# HELP namedprocess_namegroup_thread_io_bytes_total number of bytes read/written by these threads
# HELP namedprocess_namegroup_thread_major_page_faults_total Major page faults for these threads
# HELP namedprocess_namegroup_thread_minor_page_faults_total Minor page faults for these threads
# HELP namedprocess_namegroup_threads_wchan Number of threads in this group waiting on each wchan
# HELP namedprocess_namegroup_worst_fd_ratio the worst (closest to 1) ratio between open fds and max fds among all procs in this group
# HELP namedprocess_namegroup_write_bytes_total number of bytes written by this group
# HELP namedprocess_scrape_errors general scrape errors: no proc metrics collected during a cycle
# HELP namedprocess_scrape_partial_errors incremented each time a tracked proc's metrics collection fails partially, e.g. unreadable I/O stats
# HELP namedprocess_scrape_procread_errors incremented each time a proc's metrics collection fails
# HELP process_cpu_seconds_total Total user and system CPU time spent in seconds.
# HELP process_exporter_build_info A metric with a constant '1' value labeled by version, revision, branch, and goversion from which process_exporter was built.
# HELP process_max_fds Maximum number of open file descriptors.
# HELP process_open_fds Number of open file descriptors.
# HELP process_resident_memory_bytes Resident memory size in bytes.
# HELP process_start_time_seconds Start time of the process since unix epoch in seconds.
# HELP process_virtual_memory_bytes Virtual memory size in bytes.
# HELP process_virtual_memory_max_bytes Maximum amount of virtual memory available in bytes.
# HELP promhttp_metric_handler_requests_in_flight Current number of scrapes being served.
# HELP promhttp_metric_handler_requests_total Total number of scrapes by HTTP status code.
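As a rough illustration of how the namedprocess_namegroup_* series above can be consumed, the sketch below ranks process groups by resident memory from a block of /metrics text. It assumes that namedprocess_namegroup_memory_bytes carries groupname and memtype labels and that "resident" is one of the memtype values; the function names are made up for this example and are not part of process-exporter.

```python
import re

# Matches lines such as:
#   namedprocess_namegroup_memory_bytes{groupname="bash",memtype="resident"} 2048
LINE = re.compile(
    r'^namedprocess_namegroup_memory_bytes\{(?P<labels>.*)\}\s+(?P<value>\S+)$'
)
# One label pair inside the braces: name="value" (escaped characters allowed).
LABEL = re.compile(r'(\w+)="((?:[^"\\]|\\.)*)"')


def resident_memory_by_group(text):
    """Return {groupname: resident bytes} from Prometheus exposition text."""
    usage = {}
    for line in text.splitlines():
        match = LINE.match(line.strip())
        if not match:
            continue  # comment line or a different metric
        labels = dict(LABEL.findall(match.group("labels")))
        # Assumption: the memtype label distinguishes resident/virtual/etc.
        if labels.get("memtype") == "resident":
            usage[labels.get("groupname", "")] = float(match.group("value"))
    return usage


def top_groups(text, n=5):
    """Return the n groups with the largest resident memory, largest first."""
    usage = resident_memory_by_group(text)
    return sorted(usage.items(), key=lambda kv: kv[1], reverse=True)[:n]
```

Feeding the body of a /metrics response into top_groups(text, 3) would give the three heaviest groups as (groupname, bytes) pairs.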
 
 

How collected metrics are delivered

=  Pull model
process-exporter exposes an HTTP endpoint, /metrics, so that a metrics server such as Prometheus Server can fetch the metric data it has collected over HTTP.
Because the exporter serves this endpoint, the server sends an HTTP GET request to it and collects the metric data using the pull model.
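The pull model described above can be sketched with nothing but the Python standard library: one HTTP GET against /metrics, followed by a very small parser for the text exposition format. This is a minimal sketch and not how Prometheus itself scrapes; it assumes samples have no trailing timestamp, and the function names are hypothetical.

```python
from urllib.request import urlopen


def parse_samples(text):
    """Parse Prometheus text exposition lines into {series: value}."""
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        # Assumption: no trailing timestamp, so the value is the last field.
        series, _, value = line.rpartition(" ")
        samples[series] = float(value)
    return samples


def scrape(url, timeout=5.0):
    """Issue one HTTP GET against an exporter's /metrics endpoint."""
    with urlopen(url, timeout=timeout) as resp:
        return parse_samples(resp.read().decode("utf-8"))
```

For example, scrape("http://localhost:9256/metrics") would return the current samples, assuming an exporter is reachable at that address (9256 is the port used in the manifest in this post; the host is a placeholder).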
 

 

 
How to deploy process-exporter
 
process-exporter.yaml
 
Configuration file
apiVersion: v1
kind: ConfigMap
metadata:
  name: process-exporter-conf
  labels:
    name: process-exporter-conf
  namespace: monitoring
data:
  process-exporter-config.yml: |-
    process_names:
    - name: "{{.Comm}}"
      cmdline:
      - '.+'
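To make the effect of this configuration more concrete, the sketch below approximates it in Python: every process whose command line matches the regex '.+' is counted under its comm name, which is roughly what the namedprocess_namegroup_num_procs metric reports per group. It is an approximation written for this post, not process-exporter's actual matching code, and it assumes the command line is matched as a single space-joined string.

```python
import os
import re
from collections import Counter

# Same regex as the cmdline entry in the config above.
CMDLINE_PATTERN = re.compile(r".+")


def group_by_comm(procfs="/proc"):
    """Count processes per comm name, skipping those with an empty cmdline."""
    counts = Counter()
    for entry in os.listdir(procfs):
        if not entry.isdigit():
            continue  # only numeric directory names are PIDs
        base = os.path.join(procfs, entry)
        try:
            with open(os.path.join(base, "comm"), encoding="utf-8",
                      errors="replace") as f:
                comm = f.read().strip()
            with open(os.path.join(base, "cmdline"), "rb") as f:
                # Arguments are NUL-separated in procfs.
                args = [arg for arg in f.read().split(b"\0") if arg]
        except OSError:
            continue  # the process exited or is not readable
        cmdline = b" ".join(args).decode("utf-8", errors="replace")
        # An empty cmdline (typical for kernel threads) does not match '.+'.
        if CMDLINE_PATTERN.search(cmdline):
            counts[comm] += 1
    return counts
```

Running group_by_comm() on a Linux host gives one entry per comm name, which also hints at how many series this catch-all configuration can produce on a busy node.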
 
Deploy
apiVersion: v1
kind: ConfigMap
metadata:
  name: process-exporter-conf
  labels:
    name: process-exporter-conf
  namespace: monitoring
data:
  process-exporter-config.yml: |-
    process_names:
    - name: "{{.Comm}}"
      cmdline:
      - '.+'
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  labels:
    app.kubernetes.io/component: process-exporter
    app.kubernetes.io/name: process-exporter
  name: process-exporter
  namespace: monitoring
spec:
  selector:
    matchLabels:
      app.kubernetes.io/component: process-exporter
      app.kubernetes.io/name: process-exporter
  template:
    metadata:
      labels:
        app.kubernetes.io/component: process-exporter
        app.kubernetes.io/name: process-exporter
    spec:
      containers:
      - args:
        - -procfs=/host/proc
        - -config.path=/etc/process-exporter/process-exporter-config.yml
        name: process-exporter
        image: ncabatoff/process-exporter
        securityContext:
          privileged: true
        ports:
          - containerPort: 9256
            protocol: TCP
        resources:
          limits:
            cpu: 250m
            memory: 180Mi
          requests:
            cpu: 102m
            memory: 180Mi
        volumeMounts:
          - mountPath: /host/proc
            name: proc-volume
            readOnly: true
          - name: process-exporter-config-volume
            mountPath: /etc/process-exporter/
      volumes:
        - name: proc-volume
          hostPath:
            path: /proc
            type: ""
        - name: process-exporter-config-volume
          configMap:
            defaultMode: 420
            name: process-exporter-conf
---
kind: Service
apiVersion: v1
metadata:
  name: process-exporter
  namespace: monitoring
  annotations:
      prometheus.io/scrape: 'true'
      prometheus.io/port:  '9256'
spec:
  selector:
      app.kubernetes.io/component: process-exporter
      app.kubernetes.io/name: process-exporter
  ports:
  - name: metrics
    protocol: TCP
    port: 9256
    targetPort: 9256
 