Skip to content

插件集成

Prometheus

Prometheus是一个开源的监控和警报工具集,目标是提供可靠的实时监控和警报,专为动态的云环境和微服务架构设计。

bash
$ kubectl apply -f https://raw.githubusercontent.com/istio/istio/release-1.20/samples/addons/prometheus.yaml

在istio中,通过开启指标合并prometheus.io注解添加到sidecar中来设置收集指标(若已存在会被覆盖),并通过svc:15020/stats/prometheus暴露给prometheus收集。

配置istio开启指标合并

bash
$ kubectl -n istio-system apply -f - <<EOF
apiVersion: v1
kind: ConfigMap
metadata:
  name: istio
  namespace: istio-system
data:
  mesh: |-
    defaultConfig:
      discoveryAddress: istiod.istio-system.svc:15012
    defaultProviders:
      metrics:
      - prometheus
    enablePrometheusMerge: true
    rootNamespace: istio-system
    trustDomain: cluster.local
EOF

prometheus通过注解对服务指标进行抓取

yaml
apiVersion: v1
kind: Pod
metadata:
  annotations:
    istio.io/rev: default
    kubectl.kubernetes.io/default-container: tcp-echo
    kubectl.kubernetes.io/default-logs-container: tcp-echo
    prometheus.io/path: /stats/prometheus
    prometheus.io/port: "15020"
    prometheus.io/scrape: "true"# 是否要抓取该服务的数据
    sidecar.istio.io/status: '{"initContainers":["istio-init"],"containers":["istio-proxy"],"volumes":["workload-socket","credential-socket","workload-certs","istio-envoy","istio-data","istio-podinfo","istio-token","istiod-ca-cert"],"imagePullSecrets":null,"revision":"default"}'
  creationTimestamp: "2024-01-29T01:41:57Z"
  generateName: tcp-echo-54b7ccdc9-
  labels:
    app: tcp-echo
    pod-template-hash: 54b7ccdc9
    security.istio.io/tlsMode: istio
    service.istio.io/canonical-name: tcp-echo
    service.istio.io/canonical-revision: v1
    version: v1
  name: tcp-echo-54b7ccdc9-m4g6t
  namespace: test

修改svc,基于NodePort对外暴露

bash
$ kubectl -n istio-system patch svc prometheus --type='json' -p='[{"op":"replace","path":"/spec/type","value":"NodePort"},{"op":"replace","path":"/spec/ports/0/nodePort","value":30009}]'

浏览器访问IP:30009查看prometheus服务能否正常响应

prometheus

Grafana

Grafana是一个开源的数据可视化和监控平台,用于创建和共享动态、交互式的仪表板。

bash
$ kubectl apply -f https://raw.githubusercontent.com/istio/istio/master/samples/addons/grafana.yaml

修改svc,基于NodePort对外暴露

bash
$ kubectl -n istio-system patch svc grafana --type='json' -p='[{"op":"replace","path":"/spec/type","value":"NodePort"},{"op":"replace","path":"/spec/ports/0/nodePort","value":30000}]'

浏览器访问IP:30000查看grafana服务能否正常响应

grafana

OpenTelemetry

OpenTelemetry是一个可观测性框架和工具包,旨在创建和管理遥测数据,例如跟踪指标日志,它可以与jaeger、prometheus等可观测性工具一起使用。其中OpenTelemetry Collector(otel)用于接收、处理和导出遥测数据,并持久化到指定组件

部署otel-collector组件用来收集envoy的访问日志

bash
$ kubectl -n istio-system apply -f https://raw.githubusercontent.com/istio/istio/release-1.20/samples/open-telemetry/otel.yaml

修改istio配置,用来支持otel

bash
$ kubectl -n istio-system apply -f - <<EOF
apiVersion: v1
kind: ConfigMap
metadata:
  name: istio
  namespace: istio-system
data:
  mesh: |-
    defaultConfig:
      discoveryAddress: istiod.istio-system.svc:15012
    defaultProviders:
      metrics:
      - prometheus
    enablePrometheusMerge: true
    accessLogFile: /dev/stdout
    extensionProviders:
    - name: otel
      envoyOtelAls:
        service: opentelemetry-collector.istio-system
        port: 4317
    rootNamespace: istio-system
    trustDomain: cluster.local
EOF

Grafana-Loki

一个由Grafana Labs开发的开源日志聚合系统,旨在为云原生架构提供高效的日志处理解决方案。作用与ELK中的logstash类似,用来存储日志,基于标签提供日志查询的接口和数据过滤等功能。

安装loki组件,并修改otel-collector相关配置

bash
$ kubectl -n istio-system apply -f https://raw.githubusercontent.com/istio/istio/release-1.20/samples/addons/loki.yaml
$ kubectl -n istio-system apply -f https://raw.githubusercontent.com/istio/istio/release-1.20/samples/open-telemetry/loki/otel.yaml

修改istio的配置

bash
$ kubectl -n istio-system apply -f - <<EOF
apiVersion: v1
kind: ConfigMap
metadata:
  name: istio
  namespace: istio-system
data:
  mesh: |-
    defaultConfig:
      discoveryAddress: istiod.istio-system.svc:15012
    defaultProviders:
      metrics:
      - prometheus
    enablePrometheusMerge: true
    accessLogFile: /dev/stdout
    extensionProviders:
    - name: otel
      envoyOtelAls:
        service: opentelemetry-collector.istio-system
        port: 4317
        logFormat:
          text: "[%START_TIME%] \"%REQ(:METHOD)% %REQ(X-ENVOY-ORIGINAL-PATH?:PATH)% %PROTOCOL%\" %UPSTREAM_LOCAL_ADDRESS%"
          labels:
              pod: "%ENVIRONMENT(POD_NAME)%"
              namespace: "%ENVIRONMENT(POD_NAMESPACE)%"
              cluster: "%ENVIRONMENT(ISTIO_META_CLUSTER_ID)%"
              mesh: "%ENVIRONMENT(ISTIO_META_MESH_ID)%"
    rootNamespace: istio-system
    trustDomain: cluster.local
EOF

logFormat.labels的作用就是给输出到loki的日志添加标签,在grafana的界面可以通过标签检索到对应的日志。

修改grafana和loki的svc为NodePort并对外暴露30000和30010端口

bash
$ kubectl -n istio-system patch svc grafana --type='json' -p='[
  {"op":"replace","path":"/spec/type","value":"NodePort"},
  {"op":"replace","path":"/spec/ports/0/nodePort","value":30000}
]'
$ kubectl -n istio-system patch svc loki --type='json' -p='[
  {"op":"replace","path":"/spec/type","value":"NodePort"},
  {"op":"replace","path":"/spec/ports/0/nodePort","value":30010}
]'

Zipkin

信息

Zipkin是一个分布式追踪系统。它有助于收集排除服务体系结构中的延迟问题所需的定时数据。功能包括数据收集和查找。

bash
$ kubectl apply -f https://raw.githubusercontent.com/istio/istio/release-1.20/samples/addons/extras/zipkin.yaml

因为使用了kind部署测试环境,之前开放了docker的30000-30020端口,所以这里需要修改zipkin的NodePort端口为30014

bash
$ kubectl -n istio-system patch svc zipkin --type='json' -p='[{"op":"replace","path":"/spec/type","value":"NodePort"},{"op":"replace","path":"/spec/ports/0/nodePort","value":30014}]'

zipkin追踪数据采集支持2种方式,分别通过defaultConfigextensionProviders方式。

bash
$ kubectl apply -f - <<EOF
apiVersion: v1
kind: ConfigMap
metadata:
  namespace: istio-system
  name: istio
data:
  mesh: |-
    defaultConfig:
      discoveryAddress: istiod.istio-system.svc:15012
      tracing:
        zipkin:
          address: zipkin.istio-system:9411
        sampling: 100
    defaultProviders:
      metrics:
      - prometheus
    enablePrometheusMerge: true
    enableTracing: true
    accessLogFile: /dev/stdout
    rootNamespace: istio-system
    trustDomain: cluster.local
EOF
bash
$ kubectl apply -f - <<EOF
apiVersion: v1
kind: ConfigMap
metadata:
  namespace: istio-system
  name: istio
data:
  mesh: |-
    defaultConfig:
      discoveryAddress: istiod.istio-system.svc:15012
      tracing: {} # 禁用默认的tracing配置
    defaultProviders:
      metrics:
      - prometheus
    enablePrometheusMerge: true
    enableTracing: true
    accessLogFile: /dev/stdout
    extensionProviders:
    - name: zipkin
      zipkin:
        service: zipkin.istio-system.svc.cluster.local
        port: 9411
        maxTagLength: 256 # span标签长度(默认256)
    rootNamespace: istio-system
    trustDomain: cluster.local
---
apiVersion: telemetry.istio.io/v1alpha1
kind: Telemetry
metadata:
  name: mesh-default
  namespace: istio-system
spec:
  tracing:
    - providers:
        - name: "zipkin"
      randomSamplingPercentage: 100.0 # 采样率
EOF
  1. 需要通过设置tracing: {}来禁用全局tracing的默认配置:zipkin:9411
  2. 全局默认方式配置需要重启对应的服务,插件则不需要重启,但需要额外配置Telemetry

访问节点IP:30014来查看zipkin页面

zipkin2

Jaeger

信息

Jaeger是一个开源的、端到端的分布式追踪系统,用于监视和调试复杂的微服务架构。

bash
$ kubectl delete -f https://raw.githubusercontent.com/istio/istio/release-1.20/samples/addons/extras/zipkin.yaml
$ kubectl apply -f https://raw.githubusercontent.com/istio/istio/release-1.20/samples/addons/jaeger.yaml

修改jaeger svc类型为NodePort

bash
$ kubectl -n istio-system patch svc tracing --type='json' -p='[{"op":"replace","path":"/spec/type","value":"NodePort"},{"op":"replace","path":"/spec/ports/0/nodePort","value":30014}]'

因为jaeger和zipkin存在相同的service名称,先删除zipkin后再安装jaeger

和zipkin一样,jaeger也支持全局和插件的方式修改配置,将追踪数据推送到jaeger。

bash
$ kubectl -n istio-system apply -f - <<EOF
apiVersion: v1
kind: ConfigMap
metadata:
  name: istio
  namespace: istio-system
data:
  mesh: |-
    defaultConfig:
      discoveryAddress: istiod.istio-system.svc:15012
      tracing:
        zipkin:
          address: jaeger-collector.istio-system:9411
        sampling: 100.0
    defaultProviders:
      metrics:
      - prometheus
    enablePrometheusMerge: true
    enableTracing: true
    accessLogFile: /dev/stdout
    rootNamespace: istio-system
    trustDomain: cluster.local
EOF
bash
$ kubectl apply -f - <<EOF
apiVersion: v1
kind: ConfigMap
metadata:
  namespace: istio-system
  name: istio
data:
  mesh: |-
    defaultConfig:
      discoveryAddress: istiod.istio-system.svc:15012
      tracing: {} # 禁用默认的tracing配置
    defaultProviders:
      metrics:
      - prometheus
    enablePrometheusMerge: true
    enableTracing: true
    accessLogFile: /dev/stdout
    extensionProviders:
    - name: zipkin
      zipkin:
        service: zipkin.istio-system.svc.cluster.local
        port: 9411
        maxTagLength: 256 # span标签长度(默认256)
    rootNamespace: istio-system
    trustDomain: cluster.local
---
apiVersion: telemetry.istio.io/v1alpha1
kind: Telemetry
metadata:
  name: mesh-default
  namespace: istio-system
spec:
  tracing:
    - providers:
        - name: "zipkin"
      randomSamplingPercentage: 100.0 # 采样率
EOF
  1. jaeger通过全局配置方式修改后需要重启对应的服务。
  2. jaeger实现了zipkin的api,并对外暴露了名为zipkin的service,所以地址仍是zipkin.istio-system.

访问节点IP:30014来查看jaeger页面

Kiali

信息

kiali是一个用于在微服务架构中提供监控和可观测性的开源工具。通过提供可视化的监控和分析工具,Kiali使用户能够快速诊断问题优化性能,并确保微服务系统的可靠运行。

bash
# depend on prometheus & kiali & grafana & jaeger
$ kubectl apply -f https://raw.githubusercontent.com/istio/istio/master/samples/addons/prometheus.yaml
$ kubectl apply -f https://raw.githubusercontent.com/istio/istio/master/samples/addons/grafana.yaml
$ kubectl apply -f https://raw.githubusercontent.com/istio/istio/master/samples/addons/jaeger.yaml
$ kubectl apply -f https://raw.githubusercontent.com/istio/istio/master/samples/addons/kiali.yaml

kiali的分布式追踪依靠jaeger来实现

bash
# 修改kiali配置
$ kubectl -n istio-system edit cm kiali -o yaml
# 修改如下配置
external_services:
  grafana:
    enabled: true
    url: 'http://grafana.istio-system:3000/'
  tracing:
    enable: true
    use_grpc: false
    in_cluster_url: "http://tracing.istio-system/"
  prometheus:
    enable: true
    url: "http://prometheus.istio-system:9090/"

更多操作查看文档操作教程

修改kiali svc类型为NodePort

bash
$ kubectl -n istio-system patch svc kiali --type='json' -p='[{"op":"replace","path":"/spec/type","value":"NodePort"},{"op":"replace","path":"/spec/ports/0/nodePort","value":30017}]'

访问IP:30017查看kiali页面

kiali

Skywalking

Apache SkyWalking 是一个专门设计用于微服务、 云原生和容器等架构的应用性能监控 (APM) 系统。SkyWalking是可观测性的一站式解决方案, 不仅具有像Jaeger和Zipkin的分布式追踪能力,像Prometheus和Grafana的指标能力,像Kiali的日志记录能力,还能将可观测性扩展到许多其他场景, 例如将日志与链路关联,收集系统事件并将事件与指标关联,基于eBPF的服务性能分析等。

安装skywalking

bash
$ kubectl apply -f https://raw.githubusercontent.com/istio/istio/release-1.20/samples/addons/extras/skywalking.yaml

修改istio配置,向skywalking发送链路追踪数据

bash
$ kubectl -n istio-system apply -f - <<EOF
apiVersion: v1
kind: ConfigMap
metadata:
  name: istio
  namespace: istio-system
data:
  mesh: |-
    defaultConfig:
      discoveryAddress: istiod.istio-system.svc:15012
      envoyAccessLogService:
        address: skywalking-oap.istio-system:11800
    defaultProviders:
      metrics:
      - prometheus
      tracing:
      - skywalking
    enablePrometheusMerge: true
    enableEnvoyAccessLogService: true
    enableTracing: true
    accessLogFile: /dev/stdout
    extensionProviders:
    - name: skywalking
      skywalking:
        service: tracing.istio-system
        port: 11800
    - name: otel
      envoyOtelAls:
        service: opentelemetry-collector.istio-system
        port: 4317
    rootNamespace: istio-system
    trustDomain: cluster.local
EOF

开启链路追踪功能,并指定采样率

bash
$ kubectl apply -f - <<EOF
apiVersion: telemetry.istio.io/v1alpha1
kind: Telemetry
metadata:
  name: mesh-default
  namespace: istio-system
spec:
  tracing:
  - providers:
        - name: "skywalking"
    randomSamplingPercentage: 100.0
EOF

修改skywalking的svc为NodePort

bash
$ kubectl -n istio-system patch svc skywalking-ui --type='json' -p='[{"op":"replace","path":"/spec/type","value":"NodePort"},{"op":"replace","path":"/spec/ports/0/nodePort","value":30008}]'

访问IP:30008查看skywalking的web页面

skywalking

分布式追踪系统对比

类别Apache SkyWalkingJaegerZipkin
实现方式基于语言的探针、服务网格探针、eBPF agent、第三方指标库(当前支持 Zipkin)基于语言的探针基于语言的探针
数据存储ES、H2、MySQL、TiDB、Sharding-sphere、BanyanDBES、MySQL、Cassandra、内存ES、MySQL、Cassandra、内存
支持语言Java、Rust、PHP、NodeJS、Go、Python、C++、.NET、LuaJava、Go、Python、NodeJS、C#、PHP、Ruby、C++Java、Go、Python、NodeJS、C#、PHP、Ruby、C++
发起者个人UberTwitter
治理方式Apache FoundationCNCFCNCF
版本9.3.01.39.02.23.19

数据来源